Preface


The ever-increasing demand on engineers to lower production costs to withstand global 
competition has prompted engineers to look for rigorous methods of decision mak- 
ing, such as optimization methods, to design and produce products and systems both 
economically and efficiently. Optimization techniques, having reached a degree of 
maturity in recent years, are being used in a wide spectrum of industries, including 
aerospace, automotive, chemical, electrical, construction, and manufacturing industries. 
With rapidly advancing computer technology, computers are becoming more powerful, 
and correspondingly, the size and the complexity of the problems that can be solved 
using optimization techniques are also increasing. Optimization methods, coupled with 
modern tools of computer-aided design, are also being used to enhance the creative 
process of conceptual and detailed design of engineering systems. 

The purpose of this textbook is to present the techniques and applications of engi- 
neering optimization in a comprehensive manner. The style of the prior editions has 
been retained, with the theory, computational aspects, and applications of engineering 
optimization presented with detailed explanations. As in previous editions, essential 
proofs and developments of the various techniques are given in a simple manner 
without sacrificing accuracy. New concepts are illustrated with the help of numerical 
examples. Although most engineering design problems can be solved using nonlin- 
ear programming techniques, there are a variety of engineering applications for which 
other optimization methods, such as linear, geometric, dynamic, integer, and stochastic 
programming techniques, are most suitable. The theory and applications of all these 
techniques are also presented in the book. Some of the recently developed methods of 
optimization, such as genetic algorithms, simulated annealing, particle swarm optimiza- 
tion, ant colony optimization, neural-network-based methods, and fuzzy optimization, 
are also discussed. Favorable reactions and encouragement from professors, students, 
and other users of the book have provided me with the impetus to prepare this fourth 
edition of the book. The following changes have been made from the previous edition: 

• Some less-important sections were condensed or deleted. 

• Some sections were rewritten for better clarity. 

• Some sections were expanded. 

• A new chapter on modern methods of optimization is added. 

• Several examples to illustrate the use of Matlab for the solution of different types 
of optimization problems are given. 


Features 

Each topic in Engineering Optimization: Theory and Practice is self-contained, with all
concepts explained fully and the derivations presented with complete details. The
computational aspects are emphasized throughout with design examples and problems taken
from several fields of engineering to make the subject appealing to all branches of
engineering. A large number of solved examples, review questions, problems,
project-type problems, figures, and references are included to enhance the presentation
of the material.

Specific features of the book include: 

• More than 130 illustrative examples accompanying most topics. 

• More than 480 references to the literature of engineering optimization theory and 
applications. 

• More than 460 review questions to help students in reviewing and testing their 
understanding of the text material. 

• More than 510 problems, with solutions to most problems in the instructor’s 
manual. 

• More than 10 examples to illustrate the use of Matlab for the numerical solution 
of optimization problems. 

• Answers to review questions at the web site of the book, www.wiley.com/rao. 

I have used different parts of the book to teach optimum design and engineering
optimization courses at the junior/senior level as well as the first-year graduate level at the Indian
Institute of Technology, Kanpur, India; Purdue University, West Lafayette, Indiana; and the
University of Miami, Coral Gables, Florida. At the University of Miami, I cover Chapters 1,
2, 3, 5, 6, and 7 and parts of Chapters 8, 10, 12, and 13 in a dual-level course entitled
Mechanical System Optimization. In this course, a design project is also assigned to 
each student in which the student identifies, formulates, and solves a practical engineer- 
ing problem of his/her interest by applying or modifying an optimization technique. 
This design project gives the student a feeling for ways that optimization methods work 
in practice. The book can also be used, with some supplementary material, for a sec- 
ond course on engineering optimization or optimum design or structural optimization. 
The relative simplicity with which the various topics are presented makes the book 
useful both to students and to practicing engineers for purposes of self-study. The book 
also serves as a reference source for different engineering optimization applications. 
Although the emphasis of the book is on engineering applications, it would also be use- 
ful to other areas, such as operations research and economics. A knowledge of matrix 
theory and differential calculus is assumed on the part of the reader. 


Contents 

The book consists of fourteen chapters and three appendixes. Chapter 1 provides an 
introduction to engineering optimization and optimum design and an overview of opti- 
mization methods. The concepts of design space, constraint surfaces, and contours of 
objective function are introduced here. In addition, the formulation of various types of 
optimization problems is illustrated through a variety of examples taken from various 
fields of engineering. Chapter 2 reviews the essentials of differential calculus useful 
in finding the maxima and minima of functions of several variables. The methods of 
constrained variation and Lagrange multipliers are presented for solving problems with 
equality constraints. The Kuhn-Tucker conditions for inequality-constrained problems 
are given along with a discussion of convex programming problems. 




Chapters 3 and 4 deal with the solution of linear programming problems. The 
characteristics of a general linear programming problem and the development of the 
simplex method of solution are given in Chapter 3. Some advanced topics in linear 
programming, such as the revised simplex method, duality theory, the decomposition 
principle, and post-optimality analysis, are discussed in Chapter 4. The extension of 
linear programming to solve quadratic programming problems is also considered in 
Chapter 4. 

Chapters 5-7 deal with the solution of nonlinear programming problems. In 
Chapter 5, numerical methods of finding the optimum solution of a function of a single 
variable are given. Chapter 6 deals with the methods of unconstrained optimization. 
The algorithms for various zeroth-, first-, and second-order techniques are discussed 
along with their computational aspects. Chapter 7 is concerned with the solution of 
nonlinear optimization problems in the presence of inequality and equality constraints. 
Both the direct and indirect methods of optimization are discussed. The methods 
presented in this chapter can be treated as the most general techniques for the solution 
of any optimization problem. 

Chapter 8 presents the techniques of geometric programming. The solution tech- 
niques for problems of mixed inequality constraints and complementary geometric 
programming are also considered. In Chapter 9, computational procedures for solving 
discrete and continuous dynamic programming problems are presented. The problem 
of dimensionality is also discussed. Chapter 10 introduces integer programming and 
gives several algorithms for solving integer and discrete linear and nonlinear optimization
problems. Chapter 11 reviews basic probability theory and presents techniques
of stochastic linear, nonlinear, and geometric programming. The theory and applica- 
tions of calculus of variations, optimal control theory, and optimality criteria methods 
are discussed briefly in Chapter 12. Chapter 13 presents several modern methods of 
optimization including genetic algorithms, simulated annealing, particle swarm opti- 
mization, ant colony optimization, neural-network-based methods, and fuzzy system 
optimization. Several of the approximation techniques used to speed up the convergence
of practical mechanical and structural optimization problems, as well as parallel
computation and multiobjective optimization techniques, are outlined in Chapter 14.
Appendix A presents the definitions and properties of convex and concave functions. 
A brief discussion of the computational aspects and some of the commercial optimiza- 
tion programs is given in Appendix B. Finally, Appendix C presents a brief introduction 
to Matlab, optimization toolbox, and use of Matlab programs for the solution of opti- 
mization problems. 
1


Introduction to Optimization 


1.1 INTRODUCTION 

Optimization is the act of obtaining the best result under given circumstances. In design, 
construction, and maintenance of any engineering system, engineers have to make many
technological and managerial decisions at several stages. The ultimate goal of all such
decisions is either to minimize the effort required or to maximize the desired benefit. 
Since the effort required or the benefit desired in any practical situation can be expressed 
as a function of certain decision variables, optimization can be defined as the process 
of finding the conditions that give the maximum or minimum value of a function. It can 
be seen from Fig. 1.1 that if a point x* corresponds to the minimum value of function
f(x), the same point also corresponds to the maximum value of the negative of the
function, -f(x). Thus without loss of generality, optimization can be taken to mean
minimization since the maximum of a function can be found by seeking the minimum 
of the negative of the same function. 

In addition, the following operations on the objective function will not change the 
optimum solution x* (see Fig. 1.2): 

1. Multiplication (or division) of f(x) by a positive constant c. 

2. Addition (or subtraction) of a positive constant c to (or from) f(x).
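
These equivalences can be checked numerically. The following is a minimal sketch, assuming an illustrative function f(x) = (x - 2)^2 + 1 and a brute-force grid search; both the function and the search routine are choices made here for demonstration and are not taken from the text.

```python
# Illustrative check: the minimizer x* is unchanged by f -> c*f (c > 0)
# and by f -> c + f, and the maximum of -f occurs at the minimum of f.

def f(x):
    # Hypothetical objective chosen for illustration; its minimum is at x = 2.
    return (x - 2.0) ** 2 + 1.0

def argmin_on_grid(func, lo=-5.0, hi=5.0, steps=1001):
    # Brute-force search over an evenly spaced grid on [lo, hi].
    best_x, best_val = lo, func(lo)
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        val = func(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x

def argmax_on_grid(func, lo=-5.0, hi=5.0, steps=1001):
    # Maximizing func is done by minimizing its negative.
    return argmin_on_grid(lambda x: -func(x), lo, hi, steps)

c = 3.0                                          # any positive constant
x_star = argmin_on_grid(f)                       # minimizer of f
x_scaled = argmin_on_grid(lambda x: c * f(x))    # minimizer of c*f
x_shifted = argmin_on_grid(lambda x: c + f(x))   # minimizer of c + f
x_neg_max = argmax_on_grid(lambda x: -f(x))      # maximizer of -f

print(x_star, x_scaled, x_shifted, x_neg_max)
```

All four searches should land on the same grid point, which is the behavior Figs. 1.1 and 1.2 illustrate.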

There is no single method available for solving all optimization problems effi- 
ciently. Hence a number of optimization methods have been developed for solving 
different types of optimization problems. The optimum seeking methods are also known 
as mathematical programming techniques and are generally studied as a part of oper- 
ations research. Operations research is a branch of mathematics concerned with the 
application of scientific methods and techniques to decision making problems and with 
establishing the best or optimal solutions. The beginnings of the subject of operations 
research can be traced to the early period of World War II. During the war, the British
military faced the problem of allocating very scarce and limited resources (such as 
fighter airplanes, radars, and submarines) to several activities (deployment to numer- 
ous targets and destinations). Because there were no systematic methods available to 
solve resource allocation problems, the military called upon a team of mathematicians 
to develop methods for solving the problem in a scientific manner. The methods devel- 
oped by the team were instrumental in the winning of the Air Battle by Britain. These 
methods, such as linear programming, which were developed as a result of research 
on (military) operations, subsequently became known as the methods of operations 
research. 
Figure 1.1 Minimum of f(x) is same as maximum of -f(x).

Figure 1.2 Optimum solution of cf(x) or c + f(x) same as that of f(x).


Table 1.1 lists various mathematical programming techniques together with other 
well-defined areas of operations research. The classification given in Table 1.1 is not 
unique; it is given mainly for convenience. 

Mathematical programming techniques are useful in finding the minimum of a 
function of several variables under a prescribed set of constraints. Stochastic process 
techniques can be used to analyze problems described by a set of random variables 
having known probability distributions. Statistical methods enable one to analyze the 
experimental data and build empirical models to obtain the most accurate represen- 
tation of the physical situation. This book deals with the theory and application of 
mathematical programming techniques suitable for the solution of engineering design 
problems. 
Table 1.1 Methods of Operations Research

Mathematical programming or         Stochastic process            Statistical methods
optimization techniques             techniques
----------------------------------  ----------------------------  ---------------------------------------
Calculus methods                    Statistical decision theory   Regression analysis
Calculus of variations              Markov processes              Cluster analysis, pattern recognition
Nonlinear programming               Queueing theory               Design of experiments
Geometric programming               Renewal theory                Discriminant analysis (factor analysis)
Quadratic programming               Simulation methods
Linear programming                  Reliability theory
Dynamic programming
Integer programming
Stochastic programming
Separable programming
Multiobjective programming
Network methods: CPM and PERT
Game theory

Modern or nontraditional optimization techniques

Genetic algorithms
Simulated annealing
Ant colony optimization
Particle swarm optimization
Neural networks
Fuzzy optimization




1.2 HISTORICAL DEVELOPMENT 

The existence of optimization methods can be traced to the days of Newton, Lagrange, 
and Cauchy. The development of differential calculus methods of optimization was 
possible because of the contributions of Newton and Leibnitz to calculus. The founda- 
tions of calculus of variations, which deals with the minimization of functionals, were 
laid by Bernoulli, Euler, Lagrange, and Weierstrass. The method of optimization for constrained
problems, which involves the addition of unknown multipliers, became known
by the name of its inventor, Lagrange. Cauchy made the first application of the steep- 
est descent method to solve unconstrained minimization problems. Despite these early 
contributions, very little progress was made until the middle of the twentieth century, 
when high-speed digital computers made implementation of the optimization proce- 
dures possible and stimulated further research on new methods. Spectacular advances 
followed, producing a massive literature on optimization techniques. This advance- 
ment also resulted in the emergence of several well-defined new areas in optimization 
theory. 

It is interesting to note that the major developments in the area of numerical methods
of unconstrained optimization were made in the United Kingdom only in the
1960s. The development of the simplex method by Dantzig in 1947 for linear programming
problems and the enunciation of the principle of optimality in 1957 by Bellman
for dynamic programming problems paved the way for development of the methods
of constrained optimization. Work by Kuhn and Tucker in 1951 on the necessary and
sufficient conditions for the optimal solution of programming problems laid the foundations
for a great deal of later research in nonlinear programming. The contributions
of Zoutendijk and Rosen to nonlinear programming during the early 1960s have been 
significant. Although no single technique has been found to be universally applica- 
ble for nonlinear programming problems, work of Carroll and Fiacco and McCormick 
allowed many difficult problems to be solved by using the well-known techniques of 
unconstrained optimization. Geometric programming was developed in the 1960s by 
Duffin, Zener, and Peterson. Gomory did pioneering work in integer programming, 
one of the most exciting and rapidly developing areas of optimization. The reason for 
this is that most real-world applications fall under this category of problems. Dantzig 
and Charnes and Cooper developed stochastic programming techniques and solved 
problems by assuming design parameters to be independent and normally distributed. 

The desire to optimize more than one objective or goal while satisfying the phys- 
ical limitations led to the development of multiobjective programming methods. Goal 
programming is a well-known technique for solving specific types of multiobjective 
optimization problems. The goal programming was originally proposed for linear prob- 
lems by Charnes and Cooper in 1961. The foundations of game theory were laid by 
von Neumann in 1928 and since then the technique has been applied to solve several 
mathematical economics and military problems. Only during the last few years has 
game theory been applied to solve engineering design problems. 

Modern Methods of Optimization. The modern optimization methods, also some- 
times called nontraditional optimization methods, have emerged as powerful and pop- 
ular methods for solving complex engineering optimization problems in recent years. 
These methods include genetic algorithms, simulated annealing, particle swarm opti- 
mization, ant colony optimization, neural network-based optimization, and fuzzy opti- 
mization. The genetic algorithms are computerized search and optimization algorithms 
based on the mechanics of natural genetics and natural selection. The genetic algorithms 
were originally proposed by John Holland in 1975. The simulated annealing method
is based on the mechanics of the cooling process of molten metals through annealing. 
The method was originally developed by Kirkpatrick, Gelatt, and Vecchi. 

The particle swarm optimization algorithm mimics the behavior of social organisms 
such as a colony or swarm of insects (for example, ants, termites, bees, and wasps), a 
flock of birds, and a school of fish. The algorithm was originally proposed by Kennedy 
and Eberhart in 1995. The ant colony optimization is based on the cooperative behavior 
of ant colonies, which are able to find the shortest path from their nest to a food 
source. The method was first developed by Marco Dorigo in 1992. The neural network 
methods are based on the immense computational power of the nervous system to solve 
perceptional problems in the presence of massive amounts of sensory data through its
parallel processing capability. The method was originally used for optimization by
Hopfield and Tank in 1985. The fuzzy optimization methods were developed to solve
optimization problems involving design data, objective function, and constraints stated 
in imprecise form involving vague and linguistic descriptions. The fuzzy approaches 
for single and multiobjective optimization in engineering design were first presented 
by Rao in 1986. 
1.3 ENGINEERING APPLICATIONS OF OPTIMIZATION

Optimization, in its broadest sense, can be applied to solve any engineering problem. 
Some typical applications from different engineering disciplines indicate the wide scope 
of the subject: 

1. Design of aircraft and aerospace structures for minimum weight 

2. Finding the optimal trajectories of space vehicles 

3. Design of civil engineering structures such as frames, foundations, bridges, 
towers, chimneys, and dams for minimum cost 

4. Minimum-weight design of structures for earthquake, wind, and other types of 
random loading 

5. Design of water resources systems for maximum benefit 

6. Optimal plastic design of structures

7. Optimum design of linkages, cams, gears, machine tools, and other mechanical 
components 

8. Selection of machining conditions in metal-cutting processes for minimum
production cost

9. Design of material handling equipment, such as conveyors, trucks, and cranes, 
for minimum cost 

10. Design of pumps, turbines, and heat transfer equipment for maximum efficiency 

11. Optimum design of electrical machinery such as motors, generators, and trans- 
formers 

12. Optimum design of electrical networks 

13. Shortest route taken by a salesperson visiting various cities during one tour 

14. Optimal production planning, controlling, and scheduling 

15. Analysis of statistical data and building empirical models from experimental 
results to obtain the most accurate representation of the physical phenomenon 

16. Optimum design of chemical processing equipment and plants 

17. Design of optimum pipeline networks for process industries 

18. Selection of a site for an industry 

19. Planning of maintenance and replacement of equipment to reduce operating 
costs 

20. Inventory control 

21. Allocation of resources or services among several activities to maximize the 
benefit 

22. Controlling the waiting and idle times and queueing in production lines to reduce 
the costs 

23. Planning the best strategy to obtain maximum profit in the presence of a com- 
petitor 

24. Optimum design of control systems 



1.4 STATEMENT OF AN OPTIMIZATION PROBLEM 


An optimization or a mathematical programming problem can be stated as follows. 


$$
\text{Find } \mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix} \text{ which minimizes } f(\mathbf{X})
$$

subject to the constraints

$$
\begin{aligned}
g_j(\mathbf{X}) &\le 0, \quad j = 1, 2, \ldots, m \\
l_j(\mathbf{X}) &= 0, \quad j = 1, 2, \ldots, p
\end{aligned}
\tag{1.1}
$$

where $\mathbf{X}$ is an $n$-dimensional vector called the design vector, $f(\mathbf{X})$ is termed the objective
function, and $g_j(\mathbf{X})$ and $l_j(\mathbf{X})$ are known as inequality and equality constraints,
respectively. The number of variables $n$ and the number of constraints $m$ and/or $p$
need not be related in any way. The problem stated in Eq. (1.1) is called a constrained
optimization problem.† Some optimization problems do not involve any constraints and
can be stated as


$$
\text{Find } \mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix} \text{ which minimizes } f(\mathbf{X})
\tag{1.2}
$$


Such problems are called unconstrained optimization problems. 
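
The standard form of Eq. (1.1) maps naturally onto a small data structure: an objective, a list of inequality constraints, and a list of equality constraints. The sketch below is only illustrative; the helper `is_feasible`, the tolerance `tol`, and the two-variable sample problem are assumptions introduced here, not part of the text.

```python
# Minimal sketch of the standard form: minimize f(X)
# subject to g_j(X) <= 0 (j = 1..m) and l_j(X) = 0 (j = 1..p).

def is_feasible(x, inequality_constraints=(), equality_constraints=(), tol=1e-9):
    """Return True if every g_j(x) <= 0 and every l_j(x) = 0 (within tol)."""
    inequalities_ok = all(g(x) <= tol for g in inequality_constraints)
    equalities_ok = all(abs(h(x)) <= tol for h in equality_constraints)
    return inequalities_ok and equalities_ok

# Hypothetical sample problem, for illustration only:
#   minimize   f(X) = x1^2 + x2^2
#   subject to g1(X) = 1 - x1 - x2 <= 0
#              l1(X) = x1 - x2      = 0
def f(x):
    return x[0] ** 2 + x[1] ** 2

g_list = [lambda x: 1.0 - x[0] - x[1]]
l_list = [lambda x: x[0] - x[1]]

for point in ([0.5, 0.5], [0.2, 0.2], [1.0, 0.5]):
    print(point, f(point), is_feasible(point, g_list, l_list))

# With no constraints supplied, every point is feasible, which corresponds
# to the unconstrained problem of Eq. (1.2).
print(is_feasible([3.0, -7.0]))
```

Passing empty constraint lists recovers the unconstrained case, so the same representation covers both problem statements.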


1.4.1 Design Vector 

Any engineering system or component is defined by a set of quantities some of which
are viewed as variables during the design process. In general, certain quantities are
usually fixed at the outset and these are called preassigned parameters. All the other
quantities are treated as variables in the design process and are called design or decision
variables $x_i$, $i = 1, 2, \ldots, n$. The design variables are collectively represented as a
design vector $\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$. As an example, consider the design of the gear
pair shown in Fig. 1.3, characterized by its face width $b$, number of teeth $T_1$ and
$T_2$, center distance $d$, pressure angle $\psi$, tooth profile, and material. If center distance
$d$, pressure angle $\psi$, tooth profile, and material of the gears are fixed in advance,
these quantities can be called preassigned parameters. The remaining quantities can be
collectively represented by a design vector $\mathbf{X} = \{x_1, x_2, x_3\}^T = \{b, T_1, T_2\}^T$. If there are
no restrictions on the choice of $b$, $T_1$, and $T_2$, any set of three numbers will constitute a
design for the gear pair. If an $n$-dimensional Cartesian space with each coordinate axis
representing a design variable $x_i$ ($i = 1, 2, \ldots, n$) is considered, the space is called
the design variable space or simply design space. Each point in the $n$-dimensional
design space is called a design point and represents either a possible or an impossible
solution to the design problem. In the case of the design of a gear pair, the design
point $\{1.0, 20, 40\}^T$, for example, represents a possible solution, whereas the design
point $\{1.0, -20, 40.5\}^T$ represents an impossible solution since it is not possible to
have either a negative value or a fractional value for the number of teeth.

† In the mathematical programming literature, the equality constraints $l_j(\mathbf{X}) = 0$, $j = 1, 2, \ldots, p$, are often
neglected, for simplicity, in the statement of a constrained optimization problem, although several methods
are available for handling problems with equality constraints.

1.4.2 Design Constraints 

In many practical problems, the design variables cannot be chosen arbitrarily; rather, 
they have to satisfy certain specified functional and other requirements. The restrictions 
that must be satisfied to produce an acceptable design are collectively called design
constraints. Constraints that represent limitations on the behavior or performance of
the system are termed behavior or functional constraints. Constraints that represent
physical limitations on design variables, such as availability, fabricability, and transportability,
are known as geometric or side constraints. For example, for the gear pair
shown in Fig. 1.3, the face width $b$ cannot be taken smaller than a certain value, due
to strength requirements. Similarly, the ratio of the numbers of teeth, $T_1/T_2$, is dictated
by the speeds of the input and output shafts, $N_1$ and $N_2$. Since these constraints depend
on the performance of the gear pair, they are called behavior constraints. The values
of $T_1$ and $T_2$ cannot be any real numbers but can only be integers. Further, there can
be upper and lower bounds on $T_1$ and $T_2$ due to manufacturing limitations. Since these
constraints depend on the physical limitations, they are called side constraints.
1.4.3 Constraint Surface

For illustration, consider an optimization problem with only inequality constraints
$g_j(\mathbf{X}) \le 0$. The set of values of $\mathbf{X}$ that satisfy the equation $g_j(\mathbf{X}) = 0$ forms a hypersurface
in the design space and is called a constraint surface. Note that this is an
$(n - 1)$-dimensional subspace, where $n$ is the number of design variables. The constraint
surface divides the design space into two regions: one in which $g_j(\mathbf{X}) < 0$ and the other
in which $g_j(\mathbf{X}) > 0$. Thus the points lying on the hypersurface will satisfy the constraint
$g_j(\mathbf{X})$ critically, whereas the points lying in the region where $g_j(\mathbf{X}) > 0$ are infeasible
or unacceptable, and the points lying in the region where $g_j(\mathbf{X}) < 0$ are feasible or
acceptable. The collection of all the constraint surfaces $g_j(\mathbf{X}) = 0$, $j = 1, 2, \ldots, m$,
which separates the acceptable region is called the composite constraint surface.

Figure 1.4 shows a hypothetical two-dimensional design space where the infeasible 
region is indicated by hatched lines. A design point that lies on one or more than one 
constraint surface is called a bound point, and the associated constraint is called an 
active constraint. Design points that do not lie on any constraint surface are known as 
free points. Depending on whether a particular design point belongs to the acceptable 
or unacceptable region, it can be identified as one of the following four types: 

1. Free and acceptable point 

2. Free and unacceptable point 

3. Bound and acceptable point 

4. Bound and unacceptable point 

All four types of points are shown in Fig. 1.4.
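
The four-way classification can be expressed as two independent tests: whether any constraint is active (bound versus free) and whether all constraints are satisfied (acceptable versus unacceptable). The following sketch assumes a hypothetical set of three constraints in two variables and a numerical tolerance `tol` for deciding when a constraint is active; neither comes from Fig. 1.4.

```python
# Classify a design point as free/bound and acceptable/unacceptable.

def classify(x, constraints, tol=1e-9):
    values = [g(x) for g in constraints]
    # Bound: the point lies on at least one constraint surface g_j(X) = 0.
    bound = any(abs(v) <= tol for v in values)
    # Acceptable: no constraint is violated, that is, every g_j(X) <= 0.
    acceptable = all(v <= tol for v in values)
    kind = "bound" if bound else "free"
    status = "acceptable" if acceptable else "unacceptable"
    return f"{kind} and {status} point"

# Hypothetical constraints (illustration only):
#   g1 = x1 + x2 - 4 <= 0,  g2 = -x1 <= 0,  g3 = -x2 <= 0
g = [
    lambda x: x[0] + x[1] - 4.0,
    lambda x: -x[0],
    lambda x: -x[1],
]

for point in [(1.0, 1.0), (3.0, 3.0), (2.0, 2.0), (5.0, 0.0)]:
    print(point, "->", classify(point, g))
```

The last sample point lies on the surface of one constraint while violating another, which is how a bound and unacceptable point arises.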



Figure 1.4 Constraint surfaces in a hypothetical two-dimensional design space. 
1.4.4 Objective Function

The conventional design procedures aim at finding an acceptable or adequate design 
that merely satisfies the functional and other requirements of the problem. In general, 
there will be more than one acceptable design, and the purpose of optimization is 
to choose the best one of the many acceptable designs available. Thus a criterion 
has to be chosen for comparing the different alternative acceptable designs and for 
selecting the best one. The criterion with respect to which the design is optimized, 
when expressed as a function of the design variables, is known as the criterion or merit 
or objective function. The choice of objective function is governed by the nature of the
problem. The objective function for minimization is generally taken as weight in aircraft
and aerospace structural design problems. In civil engineering structural designs, the 
objective is usually taken as the minimization of cost. The maximization of mechanical 
efficiency is the obvious choice of an objective in mechanical engineering systems 
design. Thus the choice of the objective function appears to be straightforward in most 
design problems. However, there may be cases where the optimization with respect 
to a particular criterion may lead to results that may not be satisfactory with respect 
to another criterion. For example, in mechanical design, a gearbox transmitting the 
maximum power may not have the minimum weight. Similarly, in structural design, 
the minimum weight design may not correspond to minimum stress design, and the 
minimum stress design, again, may not correspond to maximum frequency design. Thus 
the selection of the objective function can be one of the most important decisions in 
the whole optimum design process. 

In some situations, there may be more than one criterion to be satisfied simul- 
taneously. For example, a gear pair may have to be designed for minimum weight 
and maximum efficiency while transmitting a specified horsepower. An optimization 
problem involving multiple objective functions is known as a multiobjective program- 
ming problem. With multiple objectives there arises a possibility of conflict, and one 
simple way to handle the problem is to construct an overall objective function as a 
linear combination of the conflicting multiple objective functions. Thus if $f_1(\mathbf{X})$ and
$f_2(\mathbf{X})$ denote two objective functions, construct a new (overall) objective function for
optimization as

$$
f(\mathbf{X}) = \alpha_1 f_1(\mathbf{X}) + \alpha_2 f_2(\mathbf{X}) \tag{1.3}
$$

where $\alpha_1$ and $\alpha_2$ are constants whose values indicate the relative importance of one
objective function relative to the other.
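
A weighted-sum objective of this kind can be built from the individual objectives in a few lines. The sketch below assumes two hypothetical single-variable objectives with different minimizers; the helper name `weighted_sum` and the sample weights are illustrative choices, not values from the text.

```python
# Build an overall objective f(X) = a1*f1(X) + a2*f2(X) from its parts.

def weighted_sum(objectives, weights):
    def overall(x):
        return sum(a * fi(x) for a, fi in zip(weights, objectives))
    return overall

# Hypothetical conflicting objectives: f1 is smallest at x = 0, f2 at x = 2.
def f1(x):
    return x[0] ** 2

def f2(x):
    return (x[0] - 2.0) ** 2

f_equal = weighted_sum([f1, f2], [0.5, 0.5])    # equal importance
f_only_f1 = weighted_sum([f1, f2], [1.0, 0.0])  # f2 ignored entirely

print(f_equal([1.0]), f_only_f1([1.0]), f_only_f1([0.0]))
```

Changing the weights shifts the minimizer of the overall function between the minimizers of the individual objectives, which is why the choice of weights encodes their relative importance.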

1.4.5 Objective Function Surfaces 

The locus of all points satisfying $f(\mathbf{X}) = C$ = constant forms a hypersurface in the
design space, and each value of $C$ corresponds to a different member of a family of
surfaces. These surfaces, called objective function surfaces, are shown in a hypothetical
two-dimensional design space in Fig. 1.5.

Once the objective function surfaces are drawn along with the constraint surfaces,
the optimum point can be determined without much difficulty. But the main problem
is that as the number of design variables exceeds two or three, the constraint and
objective function surfaces become complex even for visualization and the problem
has to be solved purely as a mathematical problem. The following example illustrates
the graphical optimization procedure.

Example 1.1 Design a uniform column of tubular section, with hinge joints at both
ends (Fig. 1.6), to carry a compressive load $P = 2500\ \mathrm{kg_f}$ for minimum cost. The
column is made up of a material that has a yield stress ($\sigma_y$) of $500\ \mathrm{kg_f/cm^2}$, modulus
of elasticity ($E$) of $0.85 \times 10^6\ \mathrm{kg_f/cm^2}$, and weight density ($\rho$) of $0.0025\ \mathrm{kg_f/cm^3}$.
The length of the column is 250 cm. The stress induced in the column should be less
than the buckling stress as well as the yield stress. The mean diameter of the column
is restricted to lie between 2 and 14 cm, and columns with thicknesses outside the
range 0.2 to 0.8 cm are not available in the market. The cost of the column includes
material and construction costs and can be taken as $5W + 2d$, where $W$ is the weight
in kilograms force and $d$ is the mean diameter of the column in centimeters.


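SOLUTION The design variables are the mean diameter ($d$) and tube thickness ($t$):

$$
\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} d \\ t \end{Bmatrix} \tag{E$_1$}
$$

The objective function to be minimized is given by

$$
f(\mathbf{X}) = 5W + 2d = 5\rho l \pi d t + 2d = 9.82 x_1 x_2 + 2 x_1 \tag{E$_2$}
$$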
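
The numerical coefficient in the objective function follows from the data of the example: the weight is the weight density times the length times the cross-sectional area of the tube, which equals pi*d*t for mean diameter d and thickness t. The short sketch below recomputes it; the function names are illustrative.

```python
import math

RHO = 0.0025     # weight density, kgf/cm^3 (from the example statement)
LENGTH = 250.0   # column length, cm (from the example statement)

def weight(d, t):
    # Weight of the tube: density * length * cross-sectional area (pi*d*t).
    return RHO * LENGTH * math.pi * d * t

def cost(d, t):
    # Cost as defined in the example: 5W + 2d.
    return 5.0 * weight(d, t) + 2.0 * d

# Coefficient multiplying x1*x2 in the objective function.
coefficient = 5.0 * RHO * LENGTH * math.pi
print(round(coefficient, 2))

# Cost of an arbitrary trial design (d = 5 cm, t = 0.3 cm), for illustration.
print(cost(5.0, 0.3))
```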
The behavior constraints can be expressed as

stress induced ≤ yield stress
stress induced ≤ buckling stress

The induced stress is given by

$$
\text{induced stress} = \sigma_i = \frac{P}{\pi d t} = \frac{2500}{\pi x_1 x_2} \tag{E$_3$}
$$

The buckling stress for a pin-connected column is given by

$$
\text{buckling stress} = \sigma_b = \frac{\text{Euler buckling load}}{\text{cross-sectional area}} = \frac{\pi^2 E I}{l^2} \frac{1}{\pi d t} \tag{E$_4$}
$$

where

I = second moment of area of the cross section of the column

$$
\begin{aligned}
I &= \frac{\pi}{64}\left(d_o^4 - d_i^4\right) = \frac{\pi}{64}\left(d_o^2 + d_i^2\right)(d_o + d_i)(d_o - d_i) \\
  &= \frac{\pi}{64}\left[(d + t)^2 + (d - t)^2\right]\left[(d + t) + (d - t)\right]\left[(d + t) - (d - t)\right] \\
  &= \frac{\pi}{8}\, d t \left(d^2 + t^2\right) = \frac{\pi}{8}\, x_1 x_2 \left(x_1^2 + x_2^2\right)
\end{aligned}
\tag{E$_5$}
$$
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Thus the behavior constraints can be restated as 
„ 2500 

gi(X) = 500 < 0 

7r.r1.r2 


„ 2500 7t 2 (0.85 x 10 6 )(x? + x?) 

£,(X) = —2 IL < 0 

’ — - 8(250) 2 


7r.riX2 

The side constraints are given by 

2 < d < 14 
0.2 < t < 0.8 

which can be expressed in standard form as 


g 3 (X) = x 1 + 2.0 < 0 
g 4 (X) = x\ - 14.0 < 0 

g 5 (X) = -*2+0.2 < 0 
g 6 (X) = r 2 - 0.8 < 0 


(E 6 ) 

(E 7 ) 


(Eg) 

(Eg) 

(E10) 

(En) 


Since there are only two design variables, the problem can be solved graphically as 
shown below. 

First, the constraint surfaces are to be plotted in a two-dimensional design space
where the two axes represent the two design variables $x_1$ and $x_2$. To plot the first
constraint surface, we have

$$g_1(\mathbf{X}) = \frac{2500}{\pi x_1 x_2} - 500 \le 0$$

that is,

$$x_1 x_2 \ge 1.593$$

Thus the curve $x_1 x_2 = 1.593$ represents the constraint surface $g_1(\mathbf{X}) = 0$. This curve
can be plotted by finding several points on the curve. The points on the curve can be
found by giving a series of values to $x_1$ and finding the corresponding values of $x_2$
that satisfy the relation $x_1 x_2 = 1.593$:


x1    2.0      4.0      6.0      8.0      10.0     12.0     14.0
x2    0.7965   0.3983   0.2655   0.1990   0.1593   0.1328   0.1140


These points are plotted and a curve $P_1Q_1$ passing through all these points is drawn as
shown in Fig. 1.7, and the infeasible region, represented by $g_1(\mathbf{X}) > 0$ or $x_1 x_2 < 1.593$,
is shown by hatched lines.¹ Similarly, the second constraint $g_2(\mathbf{X}) \le 0$ can be expressed
as $x_1 x_2 (x_1^2 + x_2^2) \ge 47.3$ and the points lying on the constraint surface $g_2(\mathbf{X}) = 0$ can
be obtained as follows for $x_1 x_2 (x_1^2 + x_2^2) = 47.3$:

x1    2      4       6       8        10       12       14
x2    2.41   0.716   0.219   0.0926   0.0473   0.0274   0.0172

¹The infeasible region can be identified by testing whether the origin lies in the feasible or infeasible
region.

These points are plotted as curve $P_2Q_2$, the feasible region is identified, and the
infeasible region is shown by hatched lines as in Fig. 1.7. The plotting of side constraints
is very simple since they represent straight lines. After plotting all six constraints,
the feasible region can be seen to be given by the bounded area ABCDEA.

Next, the contours of the objective function are to be plotted before finding the
optimum point. For this, we plot the curves given by

$$f(\mathbf{X}) = 9.82 x_1 x_2 + 2 x_1 = c = \text{constant}$$

for a series of values of $c$. By giving different values to $c$, the contours of $f$ can be
plotted with the help of the following points.

For $9.82 x_1 x_2 + 2 x_1 = 50.0$:

x2    0.1     0.2     0.3     0.4    0.5    0.6    0.7    0.8
x1    16.77   12.62   10.10   8.44   7.24   6.33   5.64   5.07

For $9.82 x_1 x_2 + 2 x_1 = 40.0$:

x2    0.1     0.2     0.3     0.4    0.5    0.6    0.7    0.8
x1    13.40   10.10   8.08    6.75   5.79   5.06   4.51   4.05

For $9.82 x_1 x_2 + 2 x_1 = 31.58$ (passing through the corner point C):

x2    0.1     0.2     0.3     0.4    0.5    0.6    0.7    0.8
x1    10.57   7.96    6.38    5.33   4.57   4.00   3.56   3.20

For $9.82 x_1 x_2 + 2 x_1 = 26.53$ (passing through the corner point B):

x2    0.1     0.2     0.3     0.4    0.5    0.6    0.7    0.8
x1    8.88    6.69    5.36    4.48   3.84   3.36   2.99   2.69

For $9.82 x_1 x_2 + 2 x_1 = 20.0$:

x2    0.1     0.2     0.3     0.4    0.5    0.6    0.7    0.8
x1    6.70    5.05    4.04    3.38   2.90   2.53   2.26   2.02


These contours are shown in Fig. 1.7 and it can be seen that the objective function
cannot be reduced below a value of 26.53 (corresponding to point B) without violating
some of the constraints. Thus the optimum solution is given by point B with
$d^* = x_1^* = 5.44$ cm and $t^* = x_2^* = 0.293$ cm with $f_{\min} = 26.53$.
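The tabulated boundary points and the corner point B of Example 1.1 can be checked numerically. The following is a minimal Python sketch, not part of the original example: it uses the rounded constants 1.593, 47.3, and 9.82 quoted in the text, and the function names, the bisection bracket, and the iteration count are illustrative assumptions. That B is the optimum still rests on the graphical argument above; the code only locates the intersection of $g_1 = 0$ and $g_2 = 0$.

```python
import math

# Rounded constants quoted in the example text.
A1 = 1.593   # g1 = 0  <=>  x1 * x2 = 1.593
A2 = 47.3    # g2 = 0  <=>  x1 * x2 * (x1**2 + x2**2) = 47.3


def objective(x1, x2):
    """Cost function f(X) = 9.82*x1*x2 + 2*x1 of Eq. (E2)."""
    return 9.82 * x1 * x2 + 2.0 * x1


def g1_boundary(x1):
    """Value of x2 on the yield-stress boundary g1 = 0 for a given x1."""
    return A1 / x1


def g2_boundary(x1, hi=10.0, iters=80):
    """Value of x2 on the buckling boundary g2 = 0, found by bisection.

    x1*x2*(x1**2 + x2**2) increases monotonically with x2 >= 0, so the
    root is unique; hi = 10 is an assumed upper bracket that is ample
    for 2 <= x1 <= 14.
    """
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if x1 * mid * (x1 ** 2 + mid ** 2) < A2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def corner_point_b():
    """Intersection of g1 = 0 and g2 = 0 (point B in Fig. 1.7).

    Dividing the two boundary equations gives x1**2 + x2**2 = A2/A1;
    substituting x2 = A1/x1 yields a quadratic in x1**2. The larger
    root is taken because the smaller one gives x1 < 2, which violates
    the side constraint on the diameter.
    """
    s = A2 / A1
    x1_sq = 0.5 * (s + math.sqrt(s * s - 4.0 * A1 * A1))
    x1 = math.sqrt(x1_sq)
    return x1, A1 / x1


if __name__ == "__main__":
    for x1 in (2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0):
        print(x1, round(g1_boundary(x1), 4), round(g2_boundary(x1), 4))
    xb1, xb2 = corner_point_b()
    print("B:", round(xb1, 3), round(xb2, 4), round(objective(xb1, xb2), 2))
```

Because the sketch works from the rounded constants, its output agrees with the tabulated values only to the precision at which the text reports them.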

1.5 CLASSIFICATION OF OPTIMIZATION PROBLEMS 

Optimization problems can be classified in several ways, as described below. 

1.5.1 Classification Based on the Existence of Constraints 

As indicated earlier, any optimization problem can be classified as constrained or
unconstrained, depending on whether constraints exist in the problem.

1.5.2 Classification Based on the Nature of the Design Variables

Based on the nature of design variables encountered, optimization problems can be
classified into two broad categories. In the first category, the problem is to find values
of a set of design parameters that make some prescribed function of these parameters
minimum subject to certain constraints. For example, the problem of minimum-weight
design of a prismatic beam shown in Fig. 1.8a subject to a limitation on the maximum
deflection can be stated as follows:

$$\text{Find } \mathbf{X} = \begin{Bmatrix} b \\ d \end{Bmatrix} \text{ which minimizes } f(\mathbf{X}) = \rho l b d \tag{1.4}$$

subject to the constraints

$$\delta_{\text{tip}}(\mathbf{X}) \le \delta_{\max}$$
$$b \ge 0$$
$$d \ge 0$$

where $\rho$ is the density and $\delta_{\text{tip}}$ is the tip deflection of the beam. Such problems are
called parameter or static optimization problems. In the second category of problems,
the objective is to find a set of design parameters, which are all continuous functions
of some other parameter, that minimizes an objective function subject to a set of
constraints. If the cross-sectional dimensions of the rectangular beam are allowed to
vary along its length as shown in Fig. 1.8b, the optimization problem can be stated as

$$\text{Find } \mathbf{X}(t) = \begin{Bmatrix} b(t) \\ d(t) \end{Bmatrix} \text{ which minimizes } f[\mathbf{X}(t)] = \rho \int_0^l b(t)\, d(t)\, dt \tag{1.5}$$

subject to the constraints

$$\delta_{\text{tip}}[\mathbf{X}(t)] \le \delta_{\max}, \quad 0 \le t \le l$$
$$b(t) \ge 0, \quad 0 \le t \le l$$
$$d(t) \ge 0, \quad 0 \le t \le l$$



Figure 1.8 Cantilever beam under concentrated load.

Here the design variables are functions of the length parameter $t$. This type of problem,
where each design variable is a function of one or more parameters, is known as a
trajectory or dynamic optimization problem [1.55].


1.5.3 Classification Based on the Physical Structure of the Problem

Depending on the physical structure of the problem, optimization problems can be 
classified as optimal control and nonoptimal control problems. 

Optimal Control Problem. An optimal control (OC) problem is a mathematical
programming problem involving a number of stages, where each stage evolves from the
preceding stage in a prescribed manner. It is usually described by two types of
variables: the control (design) and the state variables. The control variables define the
system and govern the evolution of the system from one stage to the next, and the state
variables describe the behavior or status of the system in any stage. The problem is
to find a set of control or design variables such that the total objective function (also
known as the performance index, PI) over all the stages is minimized subject to a
set of constraints on the control and state variables. An OC problem can be stated as
follows [1.55]:

$$\text{Find } \mathbf{X} \text{ which minimizes } f(\mathbf{X}) = \sum_{i=1}^{l} f_i(x_i, y_i) \tag{1.6}$$

subject to the constraints

$$q_i(x_i, y_i) + y_i = y_{i+1}, \quad i = 1, 2, \ldots, l$$
$$g_j(x_j) \le 0, \quad j = 1, 2, \ldots, l$$
$$h_k(y_k) \le 0, \quad k = 1, 2, \ldots, l$$

where $x_i$ is the $i$th control variable, $y_i$ the $i$th state variable, and $f_i$ the contribution
of the $i$th stage to the total objective function; $g_j$, $h_k$, and $q_i$ are functions of $x_j$, $y_k$,
and $x_i$ and $y_i$, respectively, and $l$ is the total number of stages. The control and state
variables $x_i$ and $y_i$ can be vectors in some cases. The following example serves to
illustrate the nature of an optimal control problem.

Example 1.2 A rocket is designed to travel a distance of $12s$ in a vertically upward
direction [1.39]. The thrust of the rocket can be changed only at the discrete points
located at distances of $0, s, 2s, 3s, \ldots, 12s$. If the maximum thrust that can be
developed at point $i$ either in the positive or negative direction is restricted to a value of
$F_i$, formulate the problem of minimizing the total time of travel under the following
assumptions:

1. The rocket travels against the gravitational force. 

2. The mass of the rocket reduces in proportion to the distance traveled. 

3. The air resistance is proportional to the velocity of the rocket.

Figure 1.9 Control points in the path of the rocket.


SOLUTION Let points (or control points) on the path at which the thrusts of the
rocket are changed be numbered as 1, 2, 3, . . . , 13 (Fig. 1.9). Denoting $x_i$ as the thrust,
$v_i$ the velocity, $a_i$ the acceleration, and $m_i$ the mass of the rocket at point $i$, Newton's
second law of motion can be applied as

net force on the rocket = mass × acceleration

This can be written as

thrust − gravitational force − air resistance = mass × acceleration

or

$$x_i - m_i g - k_1 v_i = m_i a_i \tag{E1}$$

where the mass $m_i$ can be expressed as

$$m_i = m_{i-1} - k_2 s \tag{E2}$$

and $k_1$ and $k_2$ are constants. Equation (E1) can be used to express the acceleration, $a_i$,
as

$$a_i = \frac{x_i}{m_i} - g - \frac{k_1 v_i}{m_i} \tag{E3}$$


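If $t_i$ denotes the time taken by the rocket to travel from point $i$ to point $i + 1$, the
distance traveled between the points $i$ and $i + 1$ can be expressed as

$$s = v_i t_i + \tfrac{1}{2} a_i t_i^2$$

or

$$\frac{1}{2}\, t_i^2 \left(\frac{x_i}{m_i} - g - \frac{k_1 v_i}{m_i}\right) + t_i v_i - s = 0 \tag{E4}$$

from which $t_i$ can be determined as

$$t_i = \frac{-v_i \pm \sqrt{v_i^2 + 2s\left(\dfrac{x_i}{m_i} - g - \dfrac{k_1 v_i}{m_i}\right)}}{\dfrac{x_i}{m_i} - g - \dfrac{k_1 v_i}{m_i}} \tag{E5}$$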


Of the two values given by Eq. (E5), the positive value has to be chosen for $t_i$. The
velocity of the rocket at point $i + 1$, $v_{i+1}$, can be expressed in terms of $v_i$ as (by
assuming the acceleration between points $i$ and $i + 1$ to be constant for simplicity)

$$v_{i+1} = v_i + a_i t_i \tag{E6}$$

The substitution of Eqs. (E3) and (E5) into Eq. (E6) leads to

$$v_{i+1} = \sqrt{v_i^2 + 2s\left(\frac{x_i}{m_i} - g - \frac{k_1 v_i}{m_i}\right)} \tag{E7}$$


From an analysis of the problem, the control variables can be identified as the thrusts,
$x_i$, and the state variables as the velocities, $v_i$. Since the rocket starts at point 1 and
stops at point 13,

$$v_1 = v_{13} = 0 \tag{E8}$$

Thus the problem can be stated as an OC problem as

$$\text{Find } \mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{12} \end{Bmatrix} \text{ which minimizes}$$

$$f(\mathbf{X}) = \sum_{i=1}^{12} t_i = \sum_{i=1}^{12} \left\{ \frac{-v_i + \sqrt{v_i^2 + 2s\left(\dfrac{x_i}{m_i} - g - \dfrac{k_1 v_i}{m_i}\right)}}{\dfrac{x_i}{m_i} - g - \dfrac{k_1 v_i}{m_i}} \right\}$$

subject to

$$m_{i+1} = m_i - k_2 s, \quad i = 1, 2, \ldots, 12$$
$$v_{i+1} = \sqrt{v_i^2 + 2s\left(\frac{x_i}{m_i} - g - \frac{k_1 v_i}{m_i}\right)}, \quad i = 1, 2, \ldots, 12$$
$$|x_i| \le F_i, \quad i = 1, 2, \ldots, 12$$
$$v_1 = v_{13} = 0$$
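The stage-to-stage structure of this OC problem can be made concrete by evaluating the objective for a given thrust sequence. The sketch below is only an illustration in Python: the function name `simulate`, the default values of `g`, `k1`, and `k2`, and all numerical inputs are assumptions, not data from the example. It propagates the state equations stage by stage and accumulates the travel time; it does not enforce the thrust bounds or the terminal condition $v_{13} = 0$, which an optimizer would have to impose.

```python
import math


def simulate(thrusts, m1, s, g=9.81, k1=0.0, k2=0.0, v1=0.0):
    """Propagate the rocket state through the stages of Example 1.2.

    thrusts -- control variables x_i, one per stage
    m1, v1  -- mass and velocity at the first control point
    s       -- spacing between consecutive control points
    Returns (total_time, velocities); velocities[i] is the speed at
    control point i + 1. All numerical values are hypothetical.
    """
    m, v = m1, v1
    total_time = 0.0
    velocities = [v]
    for x in thrusts:
        a = x / m - g - k1 * v / m        # acceleration, Eq. (E3)
        disc = v * v + 2.0 * s * a        # radicand in Eq. (E5)
        if disc < 0.0:
            raise ValueError("rocket cannot reach the next control point")
        if abs(a) < 1e-12:
            # Zero acceleration: uniform motion over the distance s.
            if v <= 0.0:
                raise ValueError("rocket is at rest with no net force")
            t = s / v
        else:
            t = (-v + math.sqrt(disc)) / a  # positive root of Eq. (E5)
        total_time += t
        v = math.sqrt(disc)               # velocity update, Eq. (E7)
        m -= k2 * s                       # mass update, Eq. (E2)
        velocities.append(v)
    return total_time, velocities


if __name__ == "__main__":
    # Two stages with a net acceleration of 2 m/s^2 and no drag or mass loss.
    time, vel = simulate([11.81, 11.81], m1=1.0, s=1.0)
    print(time, vel)
```

Each call evaluates $f(\mathbf{X})$ for one candidate control vector, which is the building block a dynamic programming or nonlinear programming routine would call repeatedly.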


1.5.4 Classification Based on the Nature of the Equations Involved 

Another important classification of optimization problems is based on the nature of
expressions for the objective function and the constraints. According to this
classification, optimization problems can be classified as linear, nonlinear, geometric, and
quadratic programming problems. This classification is extremely useful from the
computational point of view since there are many special methods available for the efficient
solution of a particular class of problems. Thus the first task of a designer would be
to investigate the class of problem encountered. This will, in many cases, dictate the
types of solution procedures to be adopted in solving the problem.

Nonlinear Programming Problem. If any of the functions among the objective and
constraint functions in Eq. (1.1) is nonlinear, the problem is called a nonlinear
programming (NLP) problem. This is the most general programming problem and all other
problems can be considered as special cases of the NLP problem.

Example 1.3 The step-cone pulley shown in Fig. 1.10 is to be designed for
transmitting a power of at least 0.75 hp. The speed of the input shaft is 350 rpm and the
output speed requirements are 750, 450, 250, and 150 rpm for a fixed center distance
of $a$ between the input and output shafts. The tension on the tight side of the belt is to
be kept more than twice that on the slack side. The thickness of the belt is $t$ and the
coefficient of friction between the belt and the pulleys is $\mu$. The stress induced in the
belt due to tension on the tight side is $s$. Formulate the problem of finding the width
and diameters of the steps for minimum weight.

Figure 1.10 Step-cone pulley.


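SOLUTION The design vector can be taken as

$$\mathbf{X} = \begin{Bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ w \end{Bmatrix}$$

where $d_i$ is the diameter of the $i$th step on the output pulley and $w$ is the width of the
belt and the steps. The objective function is the weight of the step-cone pulley system:

$$\begin{aligned}
f(\mathbf{X}) &= \rho w \frac{\pi}{4}\left(d_1^2 + d_2^2 + d_3^2 + d_4^2 + d_1'^2 + d_2'^2 + d_3'^2 + d_4'^2\right) \\
&= \rho w \frac{\pi}{4}\left\{ d_1^2\left[1 + \left(\frac{750}{350}\right)^2\right] + d_2^2\left[1 + \left(\frac{450}{350}\right)^2\right] + d_3^2\left[1 + \left(\frac{250}{350}\right)^2\right] + d_4^2\left[1 + \left(\frac{150}{350}\right)^2\right] \right\}
\end{aligned} \tag{E1}$$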


where $\rho$ is the density of the pulleys and $d_i'$ is the diameter of the $i$th step on the input
pulley.

To have the belt equally tight on each pair of opposite steps, the total length of the
belt must be kept constant for all the output speeds. This can be ensured by satisfying
the following equality constraints:

$$C_1 - C_2 = 0 \tag{E2}$$
$$C_1 - C_3 = 0 \tag{E3}$$
$$C_1 - C_4 = 0 \tag{E4}$$

where $C_i$ denotes the length of the belt needed to obtain output speed $N_i$ ($i = 1, 2, 3, 4$)
and is given by [1.116, 1.117]:

$$C_i \simeq \frac{\pi d_i}{2}\left(1 + \frac{N_i}{N}\right) + \frac{\left(\dfrac{N_i}{N} - 1\right)^2 d_i^2}{4a} + 2a$$

where $N$ is the speed of the input shaft and $a$ is the center distance between the shafts.
The ratio of tensions in the belt can be expressed as [1.116, 1.117]

$$\frac{T_1^i}{T_2^i} = e^{\mu \theta_i}$$

where $T_1^i$ and $T_2^i$ are the tensions on the tight and slack sides of the $i$th step, $\mu$ the
coefficient of friction, and $\theta_i$ the angle of lap of the belt over the $i$th pulley step. The
angle of lap is given by

$$\theta_i = \pi - 2 \sin^{-1}\left[\left(\frac{N_i}{N} - 1\right)\frac{d_i}{2a}\right]$$

and hence the constraint on the ratio of tensions becomes

$$\exp\left\{\mu\left[\pi - 2 \sin^{-1}\left\{\left(\frac{N_i}{N} - 1\right)\frac{d_i}{2a}\right\}\right]\right\} \ge 2, \quad i = 1, 2, 3, 4 \tag{E5}$$


The limitation on the maximum tension can be expressed as

$$T_1^i = stw, \quad i = 1, 2, 3, 4 \tag{E6}$$

where $s$ is the maximum allowable stress in the belt and $t$ is the thickness of the belt.
The constraint on the power transmitted can be stated as (using lbf for force and ft for
linear dimensions)

$$\frac{\left(T_1^i - T_2^i\right)\pi d_i' (350)}{33{,}000} \ge 0.75$$

which can be rewritten, using $T_1^i = stw$ from Eq. (E6), as

$$stw\left(1 - \exp\left[-\mu\left(\pi - 2 \sin^{-1}\left\{\left(\frac{N_i}{N} - 1\right)\frac{d_i}{2a}\right\}\right)\right]\right)\pi d_i' \left(\frac{350}{33{,}000}\right) \ge 0.75, \quad i = 1, 2, 3, 4 \tag{E7}$$

Finally, the lower bounds on the design variables can be taken as

$$w \ge 0 \tag{E8}$$
$$d_i \ge 0, \quad i = 1, 2, 3, 4 \tag{E9}$$

As the objective function, (E1), and most of the constraints, (E2) to (E9), are nonlinear
functions of the design variables $d_1$, $d_2$, $d_3$, $d_4$, and $w$, this problem is a nonlinear
programming problem.


Geometric Programming Problem.

Definition A function $h(\mathbf{X})$ is called a posynomial if $h$ can be expressed as the sum
of power terms each of the form

$$c_i\, x_1^{a_{i1}} x_2^{a_{i2}} \cdots x_n^{a_{in}}$$

where $c_i$ and $a_{ij}$ are constants with $c_i > 0$ and $x_j > 0$. Thus a posynomial with $N$ terms
can be expressed as

$$h(\mathbf{X}) = c_1 x_1^{a_{11}} x_2^{a_{12}} \cdots x_n^{a_{1n}} + \cdots + c_N x_1^{a_{N1}} x_2^{a_{N2}} \cdots x_n^{a_{Nn}} \tag{1.7}$$

A geometric programming (GMP) problem is one in which the objective function
and constraints are expressed as posynomials in $\mathbf{X}$. Thus a GMP problem can be posed
as follows [1.59]:

Find $\mathbf{X}$ which minimizes

$$f(\mathbf{X}) = \sum_{i=1}^{N_0} c_i \left(\prod_{j=1}^{n} x_j^{p_{ij}}\right), \quad c_i > 0, \quad x_j > 0 \tag{1.8}$$

subject to

$$g_k(\mathbf{X}) = \sum_{i=1}^{N_k} a_{ik} \left(\prod_{j=1}^{n} x_j^{q_{ijk}}\right) > 0, \quad a_{ik} > 0, \quad x_j > 0, \quad k = 1, 2, \ldots, m$$

where $N_0$ and $N_k$ denote the number of posynomial terms in the objective and $k$th
constraint function, respectively.


Example 1.4 Four identical helical springs are used to support a milling machine
weighing 5000 lb. Formulate the problem of finding the wire diameter ($d$), coil diameter
($D$), and the number of turns ($N$) of each spring (Fig. 1.11) for minimum weight by
limiting the deflection to 0.1 in. and the shear stress to 10,000 psi in the spring. In
addition, the natural frequency of vibration of the spring is to be greater than 100 Hz.
The stiffness of the spring ($k$), the shear stress in the spring ($\tau$), and the natural
frequency of vibration of the spring ($f_n$) are given by

$$k = \frac{d^4 G}{8 D^3 N}$$

$$\tau = K_s \frac{8 F D}{\pi d^3}$$

$$f_n = \frac{1}{2}\sqrt{\frac{k g}{w}} = \frac{1}{2}\sqrt{\frac{d^4 G}{8 D^3 N}\,\frac{g}{\rho (\pi d^2/4)\pi D N}} = \frac{\sqrt{G g}\, d}{2\sqrt{2\rho}\, \pi D^2 N}$$

where $G$ is the shear modulus, $F$ the compressive load on the spring, $w$ the weight of
the spring, $\rho$ the weight density of the spring, and $K_s$ the shear stress correction factor.
Assume that the material is spring steel with $G = 12 \times 10^6$ psi and $\rho = 0.3\ \text{lb/in}^3$, and
the shear stress correction factor is $K_s \approx 1.05$.


SOLUTION The design vector is given by

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} d \\ D \\ N \end{Bmatrix}$$

and the objective function by

$$f(\mathbf{X}) = \text{weight} = \frac{\pi d^2}{4}\, \pi D N \rho \tag{E1}$$

The constraints can be expressed as

$$\text{deflection} = \frac{F}{k} = \frac{8 F D^3 N}{d^4 G} \le 0.1$$

that is,

$$g_1(\mathbf{X}) = \frac{d^4 G}{80 F D^3 N} > 1 \tag{E2}$$

$$\text{shear stress} = K_s \frac{8 F D}{\pi d^3} \le 10{,}000$$

that is,

$$g_2(\mathbf{X}) = \frac{1250\, \pi d^3}{K_s F D} > 1 \tag{E3}$$

$$\text{natural frequency} = \frac{\sqrt{G g}\, d}{2\sqrt{2\rho}\, \pi D^2 N} \ge 100$$

that is,

$$g_3(\mathbf{X}) = \frac{\sqrt{G g}\, d}{200\sqrt{2\rho}\, \pi D^2 N} > 1 \tag{E4}$$

Since the equality sign is not included (along with the inequality symbol, >) in the
constraints of Eqs. (E2) to (E4), the design variables are to be restricted to positive
values as

$$d > 0, \quad D > 0, \quad N > 0 \tag{E5}$$


By substituting the known data, $F$ = weight of the milling machine/4 = 1250 lb, $\rho = 0.3\ \text{lb/in}^3$,
$G = 12 \times 10^6$ psi, and $K_s = 1.05$, Eqs. (E1) to (E4) become

$$f(\mathbf{X}) = \tfrac{1}{4}\pi^2 (0.3)\, d^2 D N = 0.7402\, x_1^2 x_2 x_3 \tag{E6}$$

$$g_1(\mathbf{X}) = \frac{d^4 (12 \times 10^6)}{80(1250) D^3 N} = 120\, x_1^4 x_2^{-3} x_3^{-1} > 1 \tag{E7}$$

$$g_2(\mathbf{X}) = \frac{1250\, \pi d^3}{1.05(1250) D} = 2.992\, x_1^3 x_2^{-1} > 1 \tag{E8}$$

$$g_3(\mathbf{X}) = \frac{\sqrt{G g}\, d}{200\sqrt{2\rho}\, \pi D^2 N} = 139.8388\, x_1 x_2^{-2} x_3^{-1} > 1 \tag{E9}$$

It can be seen that the objective function, $f(\mathbf{X})$, and the constraint functions, $g_1(\mathbf{X})$ to
$g_3(\mathbf{X})$, are posynomials and hence the problem is a GMP problem.
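The numerical coefficients of the posynomials in Example 1.4 can be reproduced from the stated data. The Python sketch below does this and also shows one way to store and evaluate a posynomial as a list of (coefficient, exponents) pairs. The value $g = 386\ \text{in/s}^2$ is an assumption: the example does not state it, but it reproduces the coefficient 139.8388. The names `posynomial`, `C_F`, `C_G1`, and so on are illustrative only.

```python
import math

# Data stated in Example 1.4.
F = 5000.0 / 4.0    # load carried by each of the four springs, lb
G = 12.0e6          # shear modulus, psi
RHO = 0.3           # weight density, lb/in^3
KS = 1.05           # shear stress correction factor
G_ACC = 386.0       # gravitational acceleration, in/s^2 (assumed, not in the text)

# Coefficients of the single-term posynomials for f, g1, g2, and g3.
C_F = (math.pi ** 2 / 4.0) * RHO
C_G1 = G / (80.0 * F)
C_G2 = 1250.0 * math.pi / (KS * F)
C_G3 = math.sqrt(G * G_ACC) / (200.0 * math.sqrt(2.0 * RHO) * math.pi)


def posynomial(terms, x):
    """Evaluate sum_i c_i * prod_j x_j**a_ij.

    terms -- list of (c_i, (a_i1, ..., a_in)) with every c_i > 0
    x     -- sequence of positive design variables
    """
    total = 0.0
    for c, exponents in terms:
        total += c * math.prod(xj ** a for xj, a in zip(x, exponents))
    return total


# Posynomial data for x = (d, D, N).
F_TERMS = [(C_F, (2, 1, 1))]
G1_TERMS = [(C_G1, (4, -3, -1))]
G2_TERMS = [(C_G2, (3, -1, 0))]
G3_TERMS = [(C_G3, (1, -2, -1))]

if __name__ == "__main__":
    print(C_F, C_G1, C_G2, C_G3)
    trial = (0.5, 2.0, 10.0)   # hypothetical design point, inches and turns
    print(posynomial(F_TERMS, trial), posynomial(G1_TERMS, trial))
```

Storing each function as coefficient and exponent data, rather than as a closed-form expression, is the representation that geometric programming methods operate on.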


Quadratic Programming Problem. A quadratic programming problem is a nonlinear
programming problem with a quadratic objective function and linear constraints. It is
usually formulated as follows:

$$F(\mathbf{X}) = c + \sum_{i=1}^{n} q_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} Q_{ij} x_i x_j \tag{1.9}$$

subject to

$$\sum_{i=1}^{n} a_{ij} x_i = b_j, \quad j = 1, 2, \ldots, m$$
$$x_i \ge 0, \quad i = 1, 2, \ldots, n$$


where $c$, $q_i$, $Q_{ij}$, $a_{ij}$, and $b_j$ are constants.

Example 1.5 A manufacturing firm produces two products, A and B, using two limited
resources. The maximum amounts of resources 1 and 2 available per day are 1000 and
250 units, respectively. The production of 1 unit of product A requires 1 unit of resource
1 and 0.2 unit of resource 2, and the production of 1 unit of product B requires 0.5
unit of resource 1 and 0.5 unit of resource 2. The unit costs of resources 1 and 2 are
given by the relations $(0.375 - 0.00005u_1)$ and $(0.75 - 0.0001u_2)$, respectively, where
$u_i$ denotes the number of units of resource $i$ used ($i = 1, 2$). The selling prices per unit
of products A and B, $p_A$ and $p_B$, are given by

$$p_A = 2.00 - 0.0005x_A - 0.00015x_B$$
$$p_B = 3.50 - 0.0002x_A - 0.0015x_B$$

where $x_A$ and $x_B$ indicate, respectively, the number of units of products A and B sold.
Formulate the problem of maximizing the profit assuming that the firm can sell all the
units it manufactures.

SOLUTION Let the design variables be the number of units of products A and B
manufactured per day:

$$\mathbf{X} = \begin{Bmatrix} x_A \\ x_B \end{Bmatrix}$$

The requirement of resource 1 per day is $(x_A + 0.5x_B)$ and that of resource 2 is
$(0.2x_A + 0.5x_B)$ and the constraints on the resources are

$$x_A + 0.5x_B \le 1000 \tag{E1}$$
$$0.2x_A + 0.5x_B \le 250 \tag{E2}$$

The lower bounds on the design variables can be taken as

$$x_A \ge 0 \tag{E3}$$
$$x_B \ge 0 \tag{E4}$$


The total cost of resources 1 and 2 per day is

$$(x_A + 0.5x_B)[0.375 - 0.00005(x_A + 0.5x_B)] + (0.2x_A + 0.5x_B)[0.750 - 0.0001(0.2x_A + 0.5x_B)]$$

and the return per day from the sale of products A and B is

$$x_A(2.00 - 0.0005x_A - 0.00015x_B) + x_B(3.50 - 0.0002x_A - 0.0015x_B)$$

The total profit is given by the total return minus the total cost. Since the objective
function to be minimized is the negative of the profit per day, $f(\mathbf{X})$ is given by

$$\begin{aligned}
f(\mathbf{X}) = {} & (x_A + 0.5x_B)[0.375 - 0.00005(x_A + 0.5x_B)] \\
& + (0.2x_A + 0.5x_B)[0.750 - 0.0001(0.2x_A + 0.5x_B)] \\
& - x_A(2.00 - 0.0005x_A - 0.00015x_B) \\
& - x_B(3.50 - 0.0002x_A - 0.0015x_B)
\end{aligned} \tag{E5}$$

As the objective function [Eq. (E5)] is a quadratic and the constraints [Eqs. (E1) to
(E4)] are linear, the problem is a quadratic programming problem.
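The formulation of Example 1.5 translates directly into code. The following Python sketch evaluates the negative profit and the constraints, and then scans a coarse grid of production levels. The grid search, its step size, and the function names are assumptions made for illustration; it is not a quadratic programming algorithm and only gives a rough idea of where good designs lie.

```python
def resource_use(xa, xb):
    """Units of resources 1 and 2 consumed per day."""
    return xa + 0.5 * xb, 0.2 * xa + 0.5 * xb


def neg_profit(xa, xb):
    """Objective f(X) of Eq. (E5): total cost minus total return."""
    u1, u2 = resource_use(xa, xb)
    cost = u1 * (0.375 - 0.00005 * u1) + u2 * (0.750 - 0.0001 * u2)
    revenue = (xa * (2.00 - 0.0005 * xa - 0.00015 * xb)
               + xb * (3.50 - 0.0002 * xa - 0.0015 * xb))
    return cost - revenue


def feasible(xa, xb):
    """Check the resource limits and the nonnegativity bounds."""
    u1, u2 = resource_use(xa, xb)
    return xa >= 0 and xb >= 0 and u1 <= 1000 and u2 <= 250


def grid_search(step=10):
    """Return (f, xa, xb) for the best feasible point on a coarse grid."""
    best = None
    for xa in range(0, 1001, step):
        for xb in range(0, 501, step):
            if feasible(xa, xb):
                value = neg_profit(xa, xb)
                if best is None or value < best[0]:
                    best = (value, xa, xb)
    return best


if __name__ == "__main__":
    print(grid_search())
```

The grid ranges follow from the constraints: resource 1 limits $x_A$ to at most 1000 units and resource 2 limits $x_B$ to at most 500 units.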


Linear Programming Problem. If the objective function and all the constraints in
Eq. (1.1) are linear functions of the design variables, the mathematical programming
problem is called a linear programming (LP) problem. A linear programming problem
is often stated in the following standard form:

$$\text{Find } \mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix} \text{ which minimizes } f(\mathbf{X}) = \sum_{i=1}^{n} c_i x_i \tag{1.10}$$

subject to the constraints

$$\sum_{i=1}^{n} a_{ij} x_i = b_j, \quad j = 1, 2, \ldots, m$$
$$x_i \ge 0, \quad i = 1, 2, \ldots, n$$

where $c_i$, $a_{ij}$, and $b_j$ are constants.


Example 1.6 A scaffolding system consists of three beams and six ropes as shown
in Fig. 1.12. Each of the top ropes A and B can carry a load of $W_1$, each of the
middle ropes C and D can carry a load of $W_2$, and each of the bottom ropes E and
F can carry a load of $W_3$. If the loads acting on beams 1, 2, and 3 are $x_1$, $x_2$, and $x_3$,
respectively, as shown in Fig. 1.12, formulate the problem of finding the maximum
load $(x_1 + x_2 + x_3)$ that can be supported by the system. Assume that the weights of
the beams 1, 2, and 3 are $w_1$, $w_2$, and $w_3$, respectively, and the weights of the ropes
are negligible.

SOLUTION Assuming that the weights of the beams act through their respective
middle points, the equations of equilibrium for vertical forces and moments for each
of the three beams can be written as

For beam 3:

$$T_E + T_F = x_3 + w_3$$
$$x_3(3l) + w_3(2l) - T_F(4l) = 0$$

For beam 2:

$$T_C + T_D - T_E = x_2 + w_2$$
$$x_2(l) + w_2(l) + T_E(l) - T_D(2l) = 0$$

For beam 1:

$$T_A + T_B - T_C - T_D - T_F = x_1 + w_1$$
$$x_1(3l) + w_1\left(\tfrac{9}{2}l\right) - T_B(9l) + T_C(2l) + T_D(4l) + T_F(7l) = 0$$

where $T_i$ denotes the tension in rope $i$. The solution of these equations gives

$$T_F = \tfrac{3}{4}x_3 + \tfrac{1}{2}w_3$$
$$T_E = \tfrac{1}{4}x_3 + \tfrac{1}{2}w_3$$
$$T_D = \tfrac{1}{2}x_2 + \tfrac{1}{8}x_3 + \tfrac{1}{2}w_2 + \tfrac{1}{4}w_3$$
$$T_C = \tfrac{1}{2}x_2 + \tfrac{1}{8}x_3 + \tfrac{1}{2}w_2 + \tfrac{1}{4}w_3$$
$$T_B = \tfrac{1}{3}x_1 + \tfrac{1}{3}x_2 + \tfrac{2}{3}x_3 + \tfrac{1}{2}w_1 + \tfrac{1}{3}w_2 + \tfrac{5}{9}w_3$$
$$T_A = \tfrac{2}{3}x_1 + \tfrac{2}{3}x_2 + \tfrac{1}{3}x_3 + \tfrac{1}{2}w_1 + \tfrac{2}{3}w_2 + \tfrac{4}{9}w_3$$

The optimization problem can be formulated by choosing the design vector as

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}$$


Since the objective is to maximize the total load

$$f(\mathbf{X}) = -(x_1 + x_2 + x_3) \tag{E1}$$

The constraints on the forces in the ropes can be stated as

$$T_A \le W_1 \tag{E2}$$
$$T_B \le W_1 \tag{E3}$$
$$T_C \le W_2 \tag{E4}$$
$$T_D \le W_2 \tag{E5}$$
$$T_E \le W_3 \tag{E6}$$
$$T_F \le W_3 \tag{E7}$$

Finally, the nonnegativity requirement of the design variables can be expressed as

$$x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0 \tag{E8}$$

Since all the equations of the problem, (E1) to (E8), are linear functions of $x_1$, $x_2$, and
$x_3$, the problem is a linear programming problem.
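The rope tensions of Example 1.6 can be verified by substituting them back into the six equilibrium equations. The Python sketch below does this with exact rational arithmetic, taking $l = 1$ (the length cancels from every moment equation). The function names and the test loads are illustrative assumptions.

```python
from fractions import Fraction as Fr


def tensions(x1, x2, x3, w1, w2, w3):
    """Rope tensions (T_A, ..., T_F) as linear functions of loads and weights."""
    tf = Fr(3, 4) * x3 + Fr(1, 2) * w3
    te = Fr(1, 4) * x3 + Fr(1, 2) * w3
    td = Fr(1, 2) * x2 + Fr(1, 8) * x3 + Fr(1, 2) * w2 + Fr(1, 4) * w3
    tc = td  # ropes C and D carry equal loads
    tb = (Fr(1, 3) * x1 + Fr(1, 3) * x2 + Fr(2, 3) * x3
          + Fr(1, 2) * w1 + Fr(1, 3) * w2 + Fr(5, 9) * w3)
    ta = (Fr(2, 3) * x1 + Fr(2, 3) * x2 + Fr(1, 3) * x3
          + Fr(1, 2) * w1 + Fr(2, 3) * w2 + Fr(4, 9) * w3)
    return ta, tb, tc, td, te, tf


def residuals(x1, x2, x3, w1, w2, w3):
    """Left side minus right side of each equilibrium equation (l = 1)."""
    ta, tb, tc, td, te, tf = tensions(x1, x2, x3, w1, w2, w3)
    return [
        te + tf - (x3 + w3),                        # beam 3, forces
        3 * x3 + 2 * w3 - 4 * tf,                   # beam 3, moments
        tc + td - te - (x2 + w2),                   # beam 2, forces
        x2 + w2 + te - 2 * td,                      # beam 2, moments
        ta + tb - tc - td - tf - (x1 + w1),         # beam 1, forces
        3 * x1 + Fr(9, 2) * w1 - 9 * tb + 2 * tc + 4 * td + 7 * tf,  # beam 1, moments
    ]


if __name__ == "__main__":
    print(residuals(7, 11, 13, 2, 3, 5))  # arbitrary integer test loads
```

Since every tension is a linear function of $x_1$, $x_2$, and $x_3$, constraints (E2) to (E7) are linear inequalities, which is what makes the problem an LP.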


1.5.5 Classification Based on the Permissible Values of the Design Variables 

Depending on the values permitted for the design variables, optimization problems can 
be classified as integer and real-valued programming problems. 

Integer Programming Problem. If some or all of the design variables $x_1, x_2, \ldots, x_n$
of an optimization problem are restricted to take on only integer (or discrete) values,
the problem is called an integer programming problem. On the other hand, if all the
design variables are permitted to take any real value, the optimization problem is
called a real-valued programming problem. According to this definition, the problems
considered in Examples 1.1 to 1.6 are real-valued programming problems.

Example 1.7 A cargo load is to be prepared from five types of articles. The weight
$w_i$, volume $v_i$, and monetary value $c_i$ of different articles are given below.

Article type    w_i    v_i    c_i
1               4      9      5
2               8      7      6
3               2      4      3
4               5      3      2
5               3      8      8

Find the number of articles $x_i$ selected from the $i$th type ($i = 1, 2, 3, 4, 5$), so that the
total monetary value of the cargo load is a maximum. The total weight and volume of
the cargo cannot exceed the limits of 2000 and 2500 units, respectively.

SOLUTION Let $x_i$ be the number of articles of type $i$ ($i = 1$ to 5) selected. Since
it is not possible to load a fraction of an article, the variables $x_i$ can take only integer
values.

The objective function to be maximized is given by

$$f(\mathbf{X}) = 5x_1 + 6x_2 + 3x_3 + 2x_4 + 8x_5 \tag{E1}$$

and the constraints by

$$4x_1 + 8x_2 + 2x_3 + 5x_4 + 3x_5 \le 2000 \tag{E2}$$
$$9x_1 + 7x_2 + 4x_3 + 3x_4 + 8x_5 \le 2500 \tag{E3}$$
$$x_i \ge 0 \text{ and integral}, \quad i = 1, 2, \ldots, 5 \tag{E4}$$

Since $x_i$ are constrained to be integers, the problem is an integer programming
problem.


1.5.6 Classification Based on the Deterministic Nature of the Variables 


Based on the deterministic nature of the variables involved, optimization problems can 
be classified as deterministic and stochastic programming problems. 

Stochastic Programming Problem. A stochastic programming problem is an
optimization problem in which some or all of the parameters (design variables and/or
preassigned parameters) are probabilistic (nondeterministic or stochastic). According
to this definition, the problems considered in Examples 1.1 to 1.7 are deterministic
programming problems.

Example 1.8 Formulate the problem of designing a minimum-cost rectangular
underreinforced concrete beam that can carry a bending moment $M$ with a probability of at
least 0.95. The costs of concrete, steel, and formwork are given by $C_c = \$200/\text{m}^3$,
$C_s = \$5000/\text{m}^3$, and $C_f = \$40/\text{m}^2$ of surface area. The bending moment $M$ is a probabilistic
quantity and varies between $1 \times 10^5$ and $2 \times 10^5$ N-m with a uniform probability. The
strengths of concrete and steel are also uniformly distributed probabilistic quantities
whose lower and upper limits are given by

$f_c$ = 25 and 35 MPa
$f_s$ = 500 and 550 MPa

Assume that the area of the reinforcing steel and the cross-sectional dimensions of the
beam are deterministic quantities.

SOLUTION The breadth $b$ in meters, the depth $d$ in meters, and the area of reinforcing
steel $A_s$ in square meters are taken as the design variables $x_1$, $x_2$, and $x_3$, respectively
(Fig. 1.13). The cost of the beam per meter length is given by

$$f(\mathbf{X}) = \text{cost of steel} + \text{cost of concrete} + \text{cost of formwork} = A_s C_s + (bd - A_s)C_c + 2(b + d)C_f \tag{E1}$$


The resisting moment of the beam section is given by [1.119]

$$M_R = A_s f_s \left(d - 0.59\,\frac{A_s f_s}{f_c b}\right)$$

and the constraint on the bending moment can be expressed as [1.120]

$$P[M_R - M \ge 0] = P\left[A_s f_s \left(d - 0.59\,\frac{A_s f_s}{f_c b}\right) - M \ge 0\right] \ge 0.95 \tag{E2}$$

where $P[\cdots]$ indicates the probability of occurrence of the event $[\cdots]$.

Figure 1.13 Cross section of a reinforced concrete beam.

To ensure that the beam remains underreinforced,† the area of steel is bounded by
the balanced steel area $A_s^{(b)}$ as

$$A_s \le A_s^{(b)} \tag{E3}$$

where

$$A_s^{(b)} = (0.542)\,\frac{f_c}{f_s}\, bd\, \frac{600}{600 + f_s}$$

Since the design variables cannot be negative, we have

$$d \ge 0, \quad b \ge 0, \quad A_s \ge 0 \tag{E4}$$

Since the quantities M, f c , and f s are nondeterministic, the problem is a stochastic 
programming problem. 
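The probabilistic constraint (E2) of Example 1.8 has no simple closed form, but for a trial design it can be estimated by sampling. The Python sketch below is a rough Monte Carlo illustration: it assumes that $M$, $f_c$, and $f_s$ are statistically independent (the example states only that each is uniformly distributed), works in SI units (Pa, m, N-m), and uses hypothetical trial designs. The function names and sample size are likewise assumptions.

```python
import random


def resisting_moment(b, d, a_s, f_c, f_s):
    """Resisting moment M_R = A_s f_s (d - 0.59 A_s f_s / (f_c b))."""
    return a_s * f_s * (d - 0.59 * a_s * f_s / (f_c * b))


def reliability(b, d, a_s, samples=20000, seed=0):
    """Monte Carlo estimate of P[M_R - M >= 0] for one trial design.

    M, f_c, and f_s are sampled independently from their uniform ranges;
    the independence is an assumption, not a statement of the example.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(samples):
        m = rng.uniform(1.0e5, 2.0e5)        # bending moment, N-m
        f_c = rng.uniform(25.0e6, 35.0e6)    # concrete strength, Pa
        f_s = rng.uniform(500.0e6, 550.0e6)  # steel strength, Pa
        if resisting_moment(b, d, a_s, f_c, f_s) - m >= 0.0:
            successes += 1
    return successes / samples


if __name__ == "__main__":
    # Hypothetical trial designs (b and d in m, A_s in m^2).
    for design in [(0.3, 0.6, 0.002), (0.3, 0.5, 0.0007)]:
        print(design, reliability(*design))
```

A design satisfies constraint (E2) when the estimate is at least 0.95, subject to the sampling error of the estimate.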


1.5.7 Classification Based on the Separability of the Functions 

Optimization problems can be classified as separable and nonseparable programming 
problems based on the separability of the objective and constraint functions. 


†If steel area is larger than $A_s^{(b)}$, the beam becomes overreinforced and failure occurs all of a sudden due
to lack of concrete strength. If the beam is underreinforced, failure occurs due to lack of steel strength and
hence it will be gradual.

Separable Programming Problem.

Definition A function $f(\mathbf{X})$ is said to be separable if it can be expressed as the sum
of $n$ single-variable functions, $f_1(x_1), f_2(x_2), \ldots, f_n(x_n)$, that is,

$$f(\mathbf{X}) = \sum_{i=1}^{n} f_i(x_i) \tag{1.11}$$

A separable programming problem is one in which the objective function and the
constraints are separable and can be expressed in standard form as

$$\text{Find } \mathbf{X} \text{ which minimizes } f(\mathbf{X}) = \sum_{i=1}^{n} f_i(x_i) \tag{1.12}$$

subject to

$$g_j(\mathbf{X}) = \sum_{i=1}^{n} g_{ij}(x_i) \le b_j, \quad j = 1, 2, \ldots, m$$

where $b_j$ is a constant.

Example 1.9 A retail store stocks and sells three different models of TV sets. The
store cannot afford to have an inventory worth more than \$45,000 at any time. The
TV sets are ordered in lots. It costs \$$a_j$ for the store whenever a lot of TV model $j$
is ordered. The cost of one TV set of model $j$ is $c_j$. The demand rate of TV model
$j$ is $d_j$ units per year. The rate at which the inventory costs accumulate is known to
be proportional to the investment in inventory at any time, with $q_j = 0.5$ denoting
the constant of proportionality for TV model $j$. Each TV set occupies an area of
$s_j = 0.40\ \text{m}^2$ and the maximum storage space available is $90\ \text{m}^2$. The data known from
the past experience are given below.

                           TV model j
                           1       2       3
Ordering cost, a_j ($)     50      80      100
Unit cost, c_j ($)         40      120     80
Demand rate, d_j           800     400     1200


Formulate the problem of minimizing the average annual cost of ordering and storing 
the TV sets. 

SOLUTION Let $x_j$ denote the number of TV sets of model $j$ ordered in each lot
($j = 1, 2, 3$). Since the demand rate per year of model $j$ is $d_j$, the number of times
the TV model $j$ needs to be ordered is $d_j/x_j$. The cost of ordering TV model $j$ per
year is thus $a_j d_j/x_j$, $j = 1, 2, 3$. The cost of storing TV sets of model $j$ per year is
$q_j c_j x_j/2$ since the average level of inventory at any time during the year is equal to
$c_j x_j/2$. Thus the objective function (cost of ordering plus storing) can be expressed
as

$$f(\mathbf{X}) = \left(\frac{a_1 d_1}{x_1} + \frac{q_1 c_1 x_1}{2}\right) + \left(\frac{a_2 d_2}{x_2} + \frac{q_2 c_2 x_2}{2}\right) + \left(\frac{a_3 d_3}{x_3} + \frac{q_3 c_3 x_3}{2}\right) \tag{E1}$$

where the design vector $\mathbf{X}$ is given by

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} \tag{E2}$$

The constraint on the worth of inventory can be stated as

$$c_1 x_1 + c_2 x_2 + c_3 x_3 \le 45{,}000 \tag{E3}$$

The limitation on the storage area is given by

$$s_1 x_1 + s_2 x_2 + s_3 x_3 \le 90 \tag{E4}$$

Since the design variables cannot be negative, we have

$$x_j \ge 0, \quad j = 1, 2, 3 \tag{E5}$$


By substituting the known data, the optimization problem can be stated as follows:
Find $\mathbf{X}$ which minimizes

$$f(\mathbf{X}) = \left(\frac{40{,}000}{x_1} + 10x_1\right) + \left(\frac{32{,}000}{x_2} + 30x_2\right) + \left(\frac{120{,}000}{x_3} + 20x_3\right) \tag{E6}$$

subject to

$$g_1(\mathbf{X}) = 40x_1 + 120x_2 + 80x_3 \le 45{,}000 \tag{E7}$$
$$g_2(\mathbf{X}) = 0.40(x_1 + x_2 + x_3) \le 90 \tag{E8}$$
$$g_3(\mathbf{X}) = -x_1 \le 0 \tag{E9}$$
$$g_4(\mathbf{X}) = -x_2 \le 0 \tag{E10}$$
$$g_5(\mathbf{X}) = -x_3 \le 0 \tag{E11}$$

It can be observed that the optimization problem stated in Eqs. (E6) to (E11) is a
separable programming problem.
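Separability means that each term of the objective depends on a single variable, so each term can be examined on its own. The Python sketch below, which goes slightly beyond the formulation asked for in Example 1.9, sets the derivative of each term $a_j d_j/x_j + q_j c_j x_j/2$ to zero, giving $x_j = \sqrt{2 a_j d_j/(q_j c_j)}$, and then checks the two resource constraints at that point. The function names are illustrative assumptions, and integrality of the lot sizes is ignored.

```python
import math

# Data of Example 1.9 for TV models j = 1, 2, 3.
A = (50.0, 80.0, 100.0)     # ordering cost a_j, $
C = (40.0, 120.0, 80.0)     # unit cost c_j, $
D = (800.0, 400.0, 1200.0)  # demand rate d_j, units per year
Q = 0.5                     # inventory cost proportionality constant q_j
S = 0.40                    # floor area per TV set s_j, m^2


def term_cost(j, x):
    """Single-variable term f_j(x_j): ordering plus storage cost of model j."""
    return A[j] * D[j] / x + Q * C[j] * x / 2.0


def total_cost(x):
    """Separable objective: the sum of the single-variable terms."""
    return sum(term_cost(j, xj) for j, xj in enumerate(x))


def feasible(x):
    """Inventory-worth, storage-area, and nonnegativity constraints."""
    worth = sum(cj * xj for cj, xj in zip(C, x))
    area = S * sum(x)
    return worth <= 45000.0 and area <= 90.0 and all(xj >= 0.0 for xj in x)


def termwise_minimizer():
    """Stationary point of each term, ignoring the coupling constraints."""
    return [math.sqrt(2.0 * A[j] * D[j] / (Q * C[j])) for j in range(3)]


if __name__ == "__main__":
    x = termwise_minimizer()
    print(x, total_cost(x), feasible(x))
```

With these data the term-by-term stationary point happens to satisfy both resource constraints, so neither constraint is active there; with a tighter budget or storage limit the constraints would couple the variables and a constrained method would be needed.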


1.5.8 Classification Based on the Number of Objective Functions 

Depending on the number of objective functions to be minimized, optimization
problems can be classified as single- and multiobjective programming problems. According
to this classification, the problems considered in Examples 1.1 to 1.9 are single objective
programming problems.

Multiobjective Programming Problem. A multiobjective programming problem can
be stated as follows:

$$\text{Find } \mathbf{X} \text{ which minimizes } f_1(\mathbf{X}), f_2(\mathbf{X}), \ldots, f_k(\mathbf{X}) \tag{1.13}$$

subject to

$$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m$$

where $f_1, f_2, \ldots, f_k$ denote the objective functions to be minimized simultaneously.

Example 1.10 A uniform column of rectangular cross section is to be constructed
for supporting a water tank of mass $M$ (Fig. 1.14). It is required (1) to minimize the
mass of the column for economy, and (2) to maximize the natural frequency of transverse
vibration of the system for avoiding possible resonance due to wind. Formulate
the problem of designing the column to avoid failure due to direct compression and
buckling. Assume the permissible compressive stress to be $\sigma_{\max}$.

SOLUTION Let $x_1 = b$ and $x_2 = d$ denote the cross-sectional dimensions of the
column. The mass of the column ($m$) is given by

$$m = \rho b d l = \rho l x_1 x_2 \qquad (\mathrm{E}_1)$$

where $\rho$ is the density and $l$ is the height of the column. The natural frequency of
transverse vibration of the water tank ($\omega$), by treating it as a cantilever beam with a
tip mass $M$, can be obtained as [1.118]

$$\omega = \left[\frac{3EI}{\left(M + \frac{33}{140}m\right) l^3}\right]^{1/2} \qquad (\mathrm{E}_2)$$



Figure 1.14 Water tank on a column (with the cross section of the column).
where $E$ is the Young’s modulus and $I$ is the area moment of inertia of the column
given by

$$I = \tfrac{1}{12} b d^3 \qquad (\mathrm{E}_3)$$

The natural frequency of the water tank can be maximized by minimizing $-\omega$. With
the help of Eqs. (E$_1$) and (E$_3$), Eq. (E$_2$) can be rewritten as

$$\omega = \left[\frac{E x_1 x_2^3}{4 l^3 \left(M + \frac{33}{140}\rho l x_1 x_2\right)}\right]^{1/2} \qquad (\mathrm{E}_4)$$

The direct compressive stress ($\sigma_c$) in the column due to the weight of the water tank
is given by

$$\sigma_c = \frac{Mg}{bd} = \frac{Mg}{x_1 x_2} \qquad (\mathrm{E}_5)$$

and the buckling stress for a fixed-free column ($\sigma_b$) is given by [1.121]

$$\sigma_b = \left(\frac{\pi^2 E I}{4 l^2}\right)\frac{1}{bd} = \frac{\pi^2 E x_2^2}{48 l^2} \qquad (\mathrm{E}_6)$$


To avoid failure of the column, the direct stress has to be restricted to be less than cr max 
and the buckling stress has to be constrained to be greater than the direct compressive 
stress induced. 

Finally, the design variables have to be constrained to be positive. Thus the 
multiobjective optimization problem can be stated as follows: 


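Find $\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}$ which minimizes

$$f_1(\mathbf{X}) = \rho l x_1 x_2 \qquad (\mathrm{E}_7)$$
$$f_2(\mathbf{X}) = -\left[\frac{E x_1 x_2^3}{4 l^3 \left(M + \frac{33}{140}\rho l x_1 x_2\right)}\right]^{1/2} \qquad (\mathrm{E}_8)$$

subject to

$$g_1(\mathbf{X}) = \frac{Mg}{x_1 x_2} - \sigma_{\max} \le 0 \qquad (\mathrm{E}_9)$$
$$g_2(\mathbf{X}) = \frac{Mg}{x_1 x_2} - \frac{\pi^2 E x_2^2}{48 l^2} \le 0 \qquad (\mathrm{E}_{10})$$
$$g_3(\mathbf{X}) = -x_1 \le 0 \qquad (\mathrm{E}_{11})$$
$$g_4(\mathbf{X}) = -x_2 \le 0 \qquad (\mathrm{E}_{12})$$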
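
The two objectives and four constraints of Example 1.10 can be evaluated numerically for a
candidate cross section. The sketch below is a hedged illustration in Python: the function
names and all numerical values (density, height, modulus, tank mass, permissible stress, and
the trial dimensions, in SI units) are assumptions chosen for demonstration and do not come
from the example, which is stated symbolically.

```python
import math


def column_objectives(x1, x2, rho, l, E, M):
    """Return (f1, f2): column mass and negative natural frequency."""
    f1 = rho * l * x1 * x2
    omega = math.sqrt(
        E * x1 * x2**3 / (4.0 * l**3 * (M + 33.0 / 140.0 * rho * l * x1 * x2))
    )
    return f1, -omega


def column_constraints(x1, x2, l, E, M, sigma_max, g=9.81):
    """Return [g1, g2, g3, g4] in g(X) <= 0 form."""
    direct = M * g / (x1 * x2)                           # direct compressive stress
    buckling = math.pi**2 * E * x2**2 / (48.0 * l**2)    # fixed-free buckling stress
    return [direct - sigma_max, direct - buckling, -x1, -x2]


# Hypothetical data (assumed, not from the text): a concrete-like column.
rho, l, E, M, sigma_max = 2300.0, 20.0, 30e9, 1.0e5, 20e6
x1, x2 = 0.5, 0.5  # trial cross-sectional dimensions b and d, in metres

f1, f2 = column_objectives(x1, x2, rho, l, E, M)
gs = column_constraints(x1, x2, l, E, M, sigma_max)
print("mass:", f1, "negative frequency:", f2)
print("constraints satisfied:", [gi <= 0.0 for gi in gs])
```

Evaluating both objectives at several trial designs shows the conflict that makes the problem
multiobjective: enlarging the cross section raises the natural frequency but also raises the mass.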
1.6 OPTIMIZATION TECHNIQUES

The various techniques available for the solution of different types of optimization 
problems are given under the heading of mathematical programming techniques in 
Table 1.1. The classical methods of differential calculus can be used to find the uncon- 
strained maxima and minima of a function of several variables. These methods assume 
that the function is differentiable twice with respect to the design variables and the 
derivatives are continuous. For problems with equality constraints, the Lagrange multi- 
plier method can be used. If the problem has inequality constraints, the Kuhn-Tucker 
conditions can be used to identify the optimum point. But these methods lead to a set of 
nonlinear simultaneous equations that may be difficult to solve. The classical methods 
of optimization are discussed in Chapter 2. 

The techniques of nonlinear, linear, geometric, quadratic, or integer programming 
can be used for the solution of the particular class of problems indicated by the name 
of the technique. Most of these methods are numerical techniques wherein an approx- 
imate solution is sought by proceeding in an iterative manner by starting from an 
initial solution. Linear programming techniques are described in Chapters 3 and 4. The 
quadratic programming technique, as an extension of the linear programming approach, 
is discussed in Chapter 4. Since nonlinear programming is the most general method 
of optimization that can be used to solve any optimization problem, it is dealt with in 
detail in Chapters 5-7. The geometric and integer programming methods are discussed 
in Chapters 8 and 10, respectively. The dynamic programming technique, presented in 
Chapter 9, is also a numerical procedure that is useful primarily for the solution of 
optimal control problems. Stochastic programming deals with the solution of optimiza- 
tion problems in which some of the variables are described by probability distributions. 
This topic is discussed in Chapter 11.

In Chapter 12 we discuss calculus of variations, optimal control theory, and opti- 
mality criteria methods. The modern methods of optimization, including genetic algo- 
rithms, simulated annealing, particle swarm optimization, ant colony optimization, 
neural network-based optimization, and fuzzy optimization, are presented in Chapter 
13. Several practical aspects of optimization are outlined in Chapter 14. The reduction 
of size of optimization problems, fast reanalysis techniques, the efficient computation 
of the derivatives of static displacements and stresses, eigenvalues and eigenvectors, 
and transient response are outlined. The aspects of sensitivity of optimum solution to 
problem parameters, multilevel optimization, parallel processing, and multiobjective 
optimization are also presented in this chapter. 


1.7 ENGINEERING OPTIMIZATION LITERATURE 

The literature on engineering optimization is large and diverse. Several textbooks
are available and dozens of technical periodicals regularly publish papers related to 
engineering optimization. This is primarily because optimization is applicable to all 
areas of engineering. Researchers in many fields must be attentive to the developments 
in the theory and applications of optimization. 
The most widely circulated journals that publish papers related to engineering optimization
are Engineering Optimization, ASME Journal of Mechanical Design, AIAA
Journal, ASCE Journal of Structural Engineering, Computers and Structures, International
Journal for Numerical Methods in Engineering, Structural Optimization, Journal
of Optimization Theory and Applications, Computers and Operations Research, Operations
Research, Management Science, Evolutionary Computation, IEEE Transactions
on Evolutionary Computation, European Journal of Operations Research, IEEE Transactions
on Systems, Man and Cybernetics, and Journal of Heuristics. Many of these
journals are cited in the chapter references.


1.8 SOLUTION OF OPTIMIZATION PROBLEMS USING MATLAB 

The solution of most practical optimization problems requires the use of computers.
Several commercial software systems are available to solve optimization problems that
arise in different engineering areas. MATLAB is a popular software package that is used
for the solution of a variety of scientific and engineering problems.* MATLAB has several
toolboxes, each developed for the solution of problems from a specific scientific area.
The toolbox of interest for solving optimization and related problems is called
the optimization toolbox. It contains a library of programs, or m-files, which can be
used for minimization, the solution of equations, least-squares curve fitting, and related
problems. The basic information necessary for using the various programs can be found
in the user’s guide for the optimization toolbox [1.124]. The programs or m-files, also
called functions, available in the minimization section of the optimization toolbox are
given in Table 1.2. The use of the programs listed in Table 1.2 is demonstrated at the end
of different chapters of the book.

Basically, the solution procedure involves three steps after formulating the optimization
problem in the format required by the MATLAB program (or function) to be used. In most
cases, this involves stating the objective function for minimization and the constraints in
“≤” form with zero or a constant value on the right-hand side of the inequalities. Step 1
involves writing an m-file for the objective function. Step 2 involves writing an m-file for
the constraints. Step 3 involves setting the various parameters at proper values, depending
on the characteristics of the problem and the desired output, and creating an appropriate
file to invoke the desired MATLAB program (coupling the m-files created to define the
objective and constraint functions of the problem). As an example, the use of the program
fmincon for the solution of a constrained nonlinear programming problem is demonstrated in
Example 1.11.

Example 1.11 Find the solution of the following nonlinear optimization problem
(same as the problem in Example 1.1) using the MATLAB function fmincon:

$$\text{Minimize } f(x_1, x_2) = 9.82 x_1 x_2 + 2 x_1$$

subject to

$$g_1(x_1, x_2) = \frac{2500}{\pi x_1 x_2} - 500 \le 0$$


*The basic concepts and procedures of MATLAB are summarized in Appendix C.
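Table 1.2 MATLAB Programs or Functions for Solving Optimization Problems

Type of optimization            Standard form for solution                       Name of MATLAB program
problem                         by MATLAB                                        or function to solve
                                                                                 the problem

Function of one variable or     Find $x$ to minimize $f(x)$                      fminbnd
scalar minimization             with $x_1 < x < x_2$

Unconstrained minimization      Find $\mathbf{X}$ to minimize $f(\mathbf{X})$    fminunc or fminsearch
of function of several
variables

Linear programming              Find $\mathbf{X}$ to minimize $\mathbf{f}^T\mathbf{X}$    linprog
problem                         subject to
                                $[A]\mathbf{X} \le \mathbf{b}$, $[A_{eq}]\mathbf{X} = \mathbf{b}_{eq}$,
                                $\mathbf{l} \le \mathbf{X} \le \mathbf{u}$

Quadratic programming           Find $\mathbf{X}$ to minimize                    quadprog
problem                         $\tfrac{1}{2}\mathbf{X}^T[H]\mathbf{X} + \mathbf{f}^T\mathbf{X}$ subject to
                                $[A]\mathbf{X} \le \mathbf{b}$, $[A_{eq}]\mathbf{X} = \mathbf{b}_{eq}$,
                                $\mathbf{l} \le \mathbf{X} \le \mathbf{u}$

Minimization of function of     Find $\mathbf{X}$ to minimize $f(\mathbf{X})$    fmincon
several variables subject       subject to
to constraints                  $\mathbf{c}(\mathbf{X}) \le \mathbf{0}$, $\mathbf{c}_{eq} = \mathbf{0}$,
                                $[A]\mathbf{X} \le \mathbf{b}$, $[A_{eq}]\mathbf{X} = \mathbf{b}_{eq}$,
                                $\mathbf{l} \le \mathbf{X} \le \mathbf{u}$

Goal attainment problem         Find $\mathbf{X}$ and $\gamma$ to minimize $\gamma$    fgoalattain
                                such that
                                $F(\mathbf{X}) - w\gamma \le \text{goal}$,
                                $\mathbf{c}(\mathbf{X}) \le \mathbf{0}$, $\mathbf{c}_{eq} = \mathbf{0}$,
                                $[A]\mathbf{X} \le \mathbf{b}$, $[A_{eq}]\mathbf{X} = \mathbf{b}_{eq}$,
                                $\mathbf{l} \le \mathbf{X} \le \mathbf{u}$

Minimax problem                 $\min_{\mathbf{X}} \max_{\{F_i\}} \{F_i(\mathbf{X})\}$    fminimax
                                such that
                                $\mathbf{c}(\mathbf{X}) \le \mathbf{0}$, $\mathbf{c}_{eq} = \mathbf{0}$,
                                $[A]\mathbf{X} \le \mathbf{b}$, $[A_{eq}]\mathbf{X} = \mathbf{b}_{eq}$,
                                $\mathbf{l} \le \mathbf{X} \le \mathbf{u}$

Binary integer programming      Find $\mathbf{X}$ to minimize $\mathbf{f}^T\mathbf{X}$    bintprog
problem                         subject to
                                $[A]\mathbf{X} \le \mathbf{b}$, $[A_{eq}]\mathbf{X} = \mathbf{b}_{eq}$,
                                each component of $\mathbf{X}$ is binary
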

$$g_2(x_1, x_2) = \frac{2500}{\pi x_1 x_2} - \frac{\pi^2 (x_1^2 + x_2^2)}{0.5882} \le 0$$
$$g_3(x_1, x_2) = -x_1 + 2 \le 0$$
$$g_4(x_1, x_2) = x_1 - 14 \le 0$$
$$g_5(x_1, x_2) = -x_2 + 0.2 \le 0$$
$$g_6(x_1, x_2) = x_2 - 0.8 \le 0$$
SOLUTION

Step 1: Write an M-file probofminobj.m for the objective function.

function f = probofminobj(x)
f = 9.82*x(1)*x(2) + 2*x(1);

Step 2: Write an M-file conprobformin.m for the constraints.

function [c, ceq] = conprobformin(x)
% Nonlinear inequality constraints, each in the form c(i) <= 0
c = [2500/(pi*x(1)*x(2)) - 500;
     2500/(pi*x(1)*x(2)) - (pi^2*(x(1)^2 + x(2)^2))/0.5882;
     -x(1) + 2;
     x(1) - 14;
     -x(2) + 0.2;
     x(2) - 0.8];
% Nonlinear equality constraints (none in this problem)
ceq = [];

Step 3: Invoke the constrained optimization program (write this in a new MATLAB file).

clc
clear all
warning off
x0 = [7 0.4]; % Starting guess
fprintf('The values of function value and constraints at starting point\n');
f = probofminobj(x0)
[c, ceq] = conprobformin(x0)
options = optimset('LargeScale', 'off');
[x, fval] = fmincon(@probofminobj, x0, [], [], [], [], [], [], @conprobformin, options)
fprintf('The values of constraints at optimum solution\n');
[c, ceq] = conprobformin(x) % Check the constraint values at x

This produces the solution or output as follows:

The values of function value and constraints at starting point
f =
   41.4960
c =
 -215.7947
 -540.6668
   -5.0000
   -7.0000
   -0.2000
   -0.4000
ceq =
     []
Optimization terminated: first-order optimality measure less
than options.TolFun and maximum constraint violation is less
than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower      upper     ineqlin   ineqnonlin
                                     1
                                     2
x =
    5.4510    0.2920
fval =
   26.5310
The values of constraints at optimum solution
c =
   -0.0000
   -0.0000
   -3.4510
   -8.5490
   -0.0920
   -0.5080
ceq =
     []
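
The printed values of Example 1.11 can be cross-checked without MATLAB. The following Python
fragment is a minimal sketch under stated assumptions: it is not a solver and does not
reproduce fmincon; it only re-evaluates the objective and the six constraints at the
starting point and at the reported optimum. The function names are hypothetical, and the
optimum is taken as the four-decimal values printed in the output.

```python
import math


def objective(x):
    """Objective of Example 1.11."""
    x1, x2 = x
    return 9.82 * x1 * x2 + 2.0 * x1


def constraints(x):
    """Constraints g1..g6 of Example 1.11 in g(x) <= 0 form."""
    x1, x2 = x
    common = 2500.0 / (math.pi * x1 * x2)
    return [
        common - 500.0,
        common - math.pi**2 * (x1**2 + x2**2) / 0.5882,
        -x1 + 2.0,
        x1 - 14.0,
        -x2 + 0.2,
        x2 - 0.8,
    ]


x0 = (7.0, 0.4)            # starting guess used in the example
x_opt = (5.4510, 0.2920)   # optimum as printed (rounded to four decimals)

for label, x in (("start", x0), ("reported optimum", x_opt)):
    print(label, round(objective(x), 4), [round(g, 4) for g in constraints(x)])
```

At the starting point the values agree with the listing. At the reported optimum the first two
constraints come out as small negative numbers (a few hundredths) rather than exactly zero,
because the printed design variables are rounded; this is consistent with those two
constraints being reported as active.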
REVIEW QUESTIONS

1.1 Match the following terms and descriptions:

(a) Free feasible point          $g_j(\mathbf{X}) = 0$
(b) Free infeasible point        Some $g_j(\mathbf{X}) = 0$ and other $g_j(\mathbf{X}) < 0$
(c) Bound feasible point         Some $g_j(\mathbf{X}) = 0$ and other $g_j(\mathbf{X}) > 0$
(d) Bound infeasible point       Some $g_j(\mathbf{X}) > 0$ and other $g_j(\mathbf{X}) < 0$
(e) Active constraints           All $g_j(\mathbf{X}) < 0$


1.2 Answer true or false: 

(a) Optimization problems are also known as mathematical programming problems. 

(b) The number of equality constraints can be larger than the number of design variables. 

(c) Preassigned parameters are part of design data in a design optimization problem. 

(d) Side constraints are not related to the functionality of the system. 

(e) A bound design point can be infeasible. 

(f) It is necessary that some gj(X) = 0 at the optimum point. 

(g) An optimal control problem can be solved using dynamic programming techniques. 

(h) An integer programming problem is the same as a discrete programming problem.


1.3 Define the following terms: 

(a) Mathematical programming problem 

(b) Trajectory optimization problem 

(c) Behavior constraint 

(d) Quadratic programming problem 

(e) Posynomial 

(f) Geometric programming problem 


1.4 Match the following types of problems with their descriptions.

(a) Geometric programming problem     Classical optimization problem
(b) Quadratic programming problem     Objective and constraints are quadratic
(c) Dynamic programming problem       Objective is quadratic and constraints are linear
(d) Nonlinear programming problem     Objective and constraints arise from a serial
                                      system
(e) Calculus of variations problem    Objective and constraints are polynomials with
                                      positive coefficients


1.5 How do you solve a maximization problem as a minimization problem? 
1.6 State the linear programming problem in standard form.

1.7 Define an OC problem and give an engineering example. 

1.8 What is the difference between linear and nonlinear programming problems? 

1.9 What is the difference between design variables and preassigned parameters? 

1.10 What is a design space? 

1.11 What is the difference between a constraint surface and a composite constraint surface? 

1.12 What is the difference between a bound point and a free point in the design space? 

1.13 What is a merit function? 

1.14 Suggest a simple method of handling multiple objectives in an optimization problem. 

1.15 What are objective function contours? 

1.16 What is operations research? 

1.17 State five engineering applications of optimization. 

1.18 What is an integer programming problem? 

1.19 What is graphical optimization, and what are its limitations? 

1.20 Under what conditions can a polynomial in n variables be called a posynomial? 

1.21 Define a stochastic programming problem and give two practical examples. 

1.22 What is a separable programming problem? 

PROBLEMS 

1.1 A fertilizer company purchases nitrates, phosphates, potash, and an inert chalk base at a 
cost of $1500, $500, $1000, and $100 per ton, respectively, and produces four fertilizers 
A, B, C, and D. The production cost, selling price, and composition of the four fertilizers 
are given below. 


             Production   Selling        Percentage composition by weight
             cost         price                                       Inert
Fertilizer   ($/ton)      ($/ton)     Nitrates   Phosphates   Potash   chalk base

A            100          350         5          10           5        80
B            150          550         5          15           10       70
C            200          450         10         20           10       60
D            250          700         15         5            15       65


During any week, no more than 1000 tons of nitrate, 2000 tons of phosphates, and 
1500 tons of potash will be available. The company is required to supply a minimum 
of 5000 tons of fertilizer A and 4000 tons of fertilizer D per week to its customers; 
but it is otherwise free to produce the fertilizers in any quantities it pleases. Formulate 
the problem of finding the quantity of each fertilizer to be produced by the company to 
maximize its profit. 
1.2 The two-bar truss shown in Fig. 1.15 is symmetric about the y axis. The nondimensional
area of cross section of the members $A/A_{\text{ref}}$, and the nondimensional position of joints
1 and 2, $x/h$, are treated as the design variables $x_1$ and $x_2$, respectively, where $A_{\text{ref}}$
is the reference value of the area ($A$) and $h$ is the height of the truss. The coordinates
of joint 3 are held constant. The weight of the truss ($f_1$) and the total displacement of
joint 3 under the given load ($f_2$) are to be minimized without exceeding the permissible
stress, $\sigma_0$. The weight of the truss and the displacement of joint 3 can be expressed as

$$f_1(\mathbf{X}) = 2\rho h x_2 \sqrt{1 + x_1^2}\, A_{\text{ref}}$$

$$f_2(\mathbf{X}) = \frac{P h (1 + x_1^2)^{1.5} \sqrt{1 + x_1^4}}{2\sqrt{2}\, E x_1^2 x_2 A_{\text{ref}}}$$

where $\rho$ is the weight density, $P$ the applied load, and $E$ the Young’s modulus. The
stresses induced in members 1 and 2 ($\sigma_1$ and $\sigma_2$) are given by

$$\sigma_1(\mathbf{X}) = \frac{P (1 + x_1) \sqrt{1 + x_1^2}}{2\sqrt{2}\, x_1 x_2 A_{\text{ref}}}$$

$$\sigma_2(\mathbf{X}) = \frac{P (x_1 - 1) \sqrt{1 + x_1^2}}{2\sqrt{2}\, x_1 x_2 A_{\text{ref}}}$$

In addition, upper and lower bounds are placed on design variables $x_1$ and $x_2$ as

$$x_i^{\min} \le x_i \le x_i^{\max}, \quad i = 1, 2$$

Find the solution of the problem using a graphical method with (a) $f_1$ as the objective, (b) $f_2$
as the objective, and (c) ($f_1 + f_2$) as the objective for the following data: $E = 30 \times 10^6$ psi,
$\rho = 0.283$ lb/in$^3$, $P = 10{,}000$ lb, $\sigma_0 = 20{,}000$ psi, $h = 100$ in., $A_{\text{ref}} = 1$ in$^2$,
$x_1^{\min} = 0.1$, $x_2^{\min} = 0.1$, $x_1^{\max} = 2.0$, and $x_2^{\max} = 2.5$.

1.3 Ten jobs are to be performed in an automobile assembly line as noted in the following 
table: 


         Time required to     Jobs that must be
Job      complete the         completed before
Number   job (min)            starting this job

1        4                    None
2        8                    None
3        7                    None
4        6                    None
5        3                    1, 3
6        5                    2, 3, 4
7        1                    5, 6
8        9                    6
9        2                    7, 8
10       8                    9


It is required to set up a suitable number of workstations, with one worker assigned 
to each workstation, to perform certain jobs. Formulate the problem of determining the 
number of workstations and the particular jobs to be assigned to each workstation to 
minimize the idle time of the workers as an integer programming problem. Hint: Define
variables $x_{ij}$ such that $x_{ij} = 1$ if job $i$ is assigned to station $j$, and $x_{ij} = 0$ otherwise.

1.4 A railroad track of length $L$ is to be constructed over an uneven terrain by adding or
removing dirt (Fig. 1.16). The absolute value of the slope of the track is to be restricted
to a value of $r_1$ to avoid steep slopes. The absolute value of the rate of change of the
slope is to be limited to a value $r_2$ to avoid rapid accelerations and decelerations. The
absolute value of the second derivative of the slope is to be limited to a value of $r_3$
to avoid severe jerks. Formulate the problem of finding the elevation of the track to
minimize the construction costs as an OC problem. Assume the construction costs to be
proportional to the amount of dirt added or removed. The elevation of the track is equal
to $a$ and $b$ at $x = 0$ and $x = L$, respectively.

Figure 1.16 Railroad track on an uneven terrain.

1.5 A manufacturer of a particular product produces $x_1$ units in the first week and $x_2$ units
in the second week. The number of units produced in the first and second weeks must
be at least 200 and 400, respectively, to be able to supply the regular customers. The
initial inventory is zero and the manufacturer ceases to produce the product at the end
of the second week. The production cost of a unit, in dollars, is given by $4x_i^2$, where $x_i$
is the number of units produced in week $i$ ($i = 1, 2$). In addition to the production cost,
there is an inventory cost of $10 per unit for each unit produced in the first week that
is not sold by the end of the first week. Formulate the problem of minimizing the total
cost and find its solution using a graphical optimization method.

1.6 Consider the slider-crank mechanism shown in Fig. 1.17 with the crank rotating at
a constant angular velocity $\omega$. Use a graphical procedure to find the lengths of the
crank and the connecting rod to maximize the velocity of the slider at a crank angle of
$\theta = 30°$ for $\omega = 100$ rad/s. The mechanism has to satisfy Grashof’s criterion $l \ge 2.5r$
to ensure 360° rotation of the crank. Additional constraints on the mechanism are given
by $0.5 \le r \le 10$, $2.5 \le l \le 25$, and $10 \le x \le 20$.

1.7 Solve Problem 1.6 to maximize the acceleration (instead of the velocity) of the slider at
$\theta = 30°$ for $\omega = 100$ rad/s.

1.8 It is required to stamp four circular disks of radii $R_1$, $R_2$, $R_3$, and $R_4$ from a rectangular
plate in a fabrication shop (Fig. 1.18). Formulate the problem as an optimization
problem to minimize the scrap. Identify the design variables, objective function, and the
constraints.


1.9 The torque transmitted ($T$) by a cone clutch, shown in Fig. 1.19, under uniform pressure
condition is given by

$$T = \frac{2\pi f p}{3 \sin\alpha}\,(R_1^3 - R_2^3)$$

where $p$ is the pressure between the cone and the cup, $f$ the coefficient of friction, $\alpha$
the cone angle, $R_1$ the outer radius, and $R_2$ the inner radius.

(a) Find $R_1$ and $R_2$ that minimize the volume of the cone clutch with $\alpha = 30°$,
$F = 30$ lb, and $f = 0.5$ under the constraints $T \ge 100$ lb-in., $R_1 \ge 2R_2$,
$0 \le R_1 \le 15$ in., and $0 \le R_2 \le 10$ in.

(b) What is the solution if the constraint $R_1 \ge 2R_2$ is changed to $R_1 \le 2R_2$?

(c) Find the solution of the problem stated in part (a) by assuming a uniform wear
condition between the cup and the cone. The torque transmitted ($T$) under uniform
wear condition is given by

$$T = \frac{\pi f p R_2}{\sin\alpha}\,(R_1^2 - R_2^2)$$

Note: Use graphical optimization for the solutions.

Figure 1.17 Slider-crank mechanism.

Figure 1.18 Locations of circular disks in a rectangular plate.



1.10 A hollow circular shaft is to be designed for minimum weight to achieve a minimum
reliability of 0.99 when subjected to a random torque of $(\bar{T}, \sigma_T) = (10^6, 10^4)$ lb-in.,
where $\bar{T}$ is the mean torque and $\sigma_T$ is the standard deviation of the torque, $T$. The
permissible shear stress, $\tau_0$, of the material is given by $(\bar{\tau}_0, \sigma_{\tau_0}) = (50{,}000, 5000)$ psi,
where $\bar{\tau}_0$ is the mean value and $\sigma_{\tau_0}$ is the standard deviation of $\tau_0$. The maximum
induced stress ($\tau$) in the shaft is given by

$$\tau = \frac{T r_o}{J}$$

where $r_o$ is the outer radius and $J$ is the polar moment of inertia of the cross section
of the shaft. The manufacturing tolerances on the inner and outer radii of the shaft are
specified as ±0.06 in. The length of the shaft is given by 50 ± 1 in. and the specific
weight of the material by 0.3 ± 0.03 lb/in$^3$. Formulate the optimization problem and
solve it using a graphical procedure. Assume normal distribution for all the random
variables and $3\sigma$ values for the specified tolerances. Hints: (1) The minimum reliability
requirement of 0.99 can be expressed, equivalently, as [1.120]

$$z_1 = 2.326 \le \frac{\bar{\tau} - \bar{\tau}_0}{\sqrt{\sigma_\tau^2 + \sigma_{\tau_0}^2}}$$

(2) If $f(x_1, x_2, \ldots, x_n)$ is a function of the random variables $x_1, x_2, \ldots, x_n$, the mean
value of $f$ ($\bar{f}$) and the standard deviation of $f$ ($\sigma_f$) are given by

$$\bar{f} = f(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$$

$$\sigma_f = \left[\sum_{i=1}^{n} \left(\left.\frac{\partial f}{\partial x_i}\right|_{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n}\right)^2 \sigma_{x_i}^2\right]^{1/2}$$

where $\bar{x}_i$ is the mean value of $x_i$, and $\sigma_{x_i}$ is the standard deviation of $x_i$.


1.11 Certain nonseparable optimization problems can be reduced to a separable form by
using suitable transformation of variables. For example, the product term $f = x_1 x_2$ can
be reduced to the separable form $f = y_1^2 - y_2^2$ by introducing the transformations

$$y_1 = \tfrac{1}{2}(x_1 + x_2), \qquad y_2 = \tfrac{1}{2}(x_1 - x_2)$$

Suggest suitable transformations to reduce the following terms to separable form:

(a) $f = x_1^2 x_2^3$, $x_1 > 0$, $x_2 > 0$

(b) $f = x_1^{x_2}$, $x_1 > 0$


1.12 In the design of a shell-and-tube heat exchanger (Fig. 1.20), it is decided to have the total
length of tubes equal to at least $\alpha_1$ [1.10]. The cost of the tube is $\alpha_2$ per unit length and
the cost of the shell is given by $\alpha_3 D^{2.5} L$, where $D$ is the diameter and $L$ is the length of
the heat exchanger shell. The floor space occupied by the heat exchanger costs $\alpha_4$ per unit
area and the cost of pumping cold fluid is $\alpha_5 L/(d^5 N^2)$ per day, where $d$ is the diameter
of the tube and $N$ is the number of tubes. The maintenance cost is given by $\alpha_6 N d L$.
The thermal energy transferred to the cold fluid is given by $\alpha_7/(N^{1.2} d L^{1.4}) + \alpha_8/(d^{0.2} L)$.
Formulate the mathematical programming problem of minimizing the overall cost of the
heat exchanger with the constraint that the thermal energy transferred be greater than
a specified amount $\alpha_9$. The expected life of the heat exchanger is $\alpha_{10}$ years. Assume
that $\alpha_i$, $i = 1, 2, \ldots, 10$, are known constants, and each tube occupies a cross-sectional
square of width and depth equal to $d$.
Figure 1.21 Electrical bridge network.


1.13 The bridge network shown in Fig. 1.21 consists of five resistors $R_i$ ($i = 1, 2, \ldots, 5$).
If $I_i$ is the current flowing through the resistance $R_i$, the problem is to find the resistances
$R_1, R_2, \ldots, R_5$ so that the total power dissipated by the network is a minimum. The
current $I_i$ can vary between the lower and upper limits $I_{i,\min}$ and $I_{i,\max}$, and the voltage
drop, $V_i = R_i I_i$, must be equal to a constant $c_i$ for $1 \le i \le 5$. Formulate the problem as
a mathematical programming problem.

1.14 A traveling saleswoman has to cover n towns. She plans to start from a particular town 
numbered 1, visit each of the other n — 1 towns, and return to the town 1. The distance 
between towns $i$ and $j$ is given by $d_{ij}$. Formulate the problem of selecting the sequence
in which the towns are to be visited to minimize the total distance traveled. 

1.15 A farmer has a choice of planting barley, oats, rice, or wheat on his 200-acre farm. The 
labor, water, and fertilizer requirements, yields per acre, and selling prices are given in 
the following table: 


           Labor     Water       Fertilizer               Selling
Type of    cost      required    required      Yield      price
crop       ($)       (m^3)       (lb)          (lb)       ($/lb)

Barley     300       10,000      100           1,500      0.5
Oats       200       7,000       120           3,000      0.2
Rice       250       6,000       160           2,500      0.3
Wheat      360       8,000       200           2,000      0.4


The farmer can also give part or all of the land for lease, in which case he gets $200 per
acre. The cost of water is $0.02/m$^3$ and the cost of the fertilizer is $2/lb. Assume that
the farmer has no money to start with and can get a maximum loan of $50,000 from the
land mortgage bank at an interest of 8%. He can repay the loan after six months. The
irrigation canal cannot supply more than $4 \times 10^5$ m$^3$ of water. Formulate the problem of
finding the planting schedule for maximizing the expected returns of the farmer.

1.16 There are two different sites, each with four possible targets (or depths) to drill an oil 
well. The preparation cost for each site and the cost of drilling at site i to target j are 
given below: 




            Drilling cost to target j
Site i      1      2      3      4       Preparation cost

1           4      1      9      7       11
2           7      9      5      2       13


Formulate the problem of determining the best site for each target so that the total cost 
is minimized. 

1.17 A four-pole dc motor, whose cross section is shown in Fig. 1.22, is to be designed with
the length of the stator and rotor $x_1$, the overall diameter of the motor $x_2$, the unnotched
radius $x_3$, the depth of the notches $x_4$, and the ampere turns $x_5$ as design variables.
The air gap is to be less than $k_1 \sqrt{x_2 + 7.5}$ where $k_1$ is a constant. The temperature of
the external surface of the motor cannot exceed $\Delta T$ above the ambient temperature.
Assuming that the heat can be dissipated only by radiation, formulate the problem for
maximizing the power of the motor [1.59]. Hints:

1. The heat generated due to current flow is given by [...], where $k_2$ is a
constant. The heat radiated from the external surface for a temperature difference of
$\Delta T$ is given by $k_3 x_1 x_2 \Delta T$, where $k_3$ is a constant.

2. The expression for power is given by $k_4 N B x_1 x_3 x_5$, where $k_4$ is a constant, $N$ is the
rotational speed of the rotor, and $B$ is the average flux density in the air gap.

3. The units of the various quantities are as follows. Lengths: centimeter; heat generated,
heat dissipated, and power: watt; temperature: °C; rotational speed: rpm; flux density:
gauss.

Figure 1.22 Cross section of an idealized motor (the slots house the armature winding).

1.18 A gas pipeline is to be laid between two cities A and E, making it pass through one
of the four locations in each of the intermediate towns B, C, and D (Fig. 1.23). The
associated costs are indicated in the following tables.

Costs for A to B and D to E

                                  Station i
                            1      2      3      4

From A to point i of B      30     35     25     40
From point i of D to E      50     40     35     25

Costs for B to C and C to D

                    To:
From:        1      2      3      4

1            22     18     24     18
2            35     25     15     21
3            24     20     26     20
4            22     21     23     22

Formulate the problem of minimizing the cost of the pipeline.

Figure 1.23 Possible paths of the pipeline between A and E.

1.19 A beam-column of rectangular cross section is required to carry an axial load of 25 lb
and a transverse load of 10 lb, as shown in Fig. 1.24. It is to be designed to avoid the
possibility of yielding and buckling and for minimum weight. Formulate the optimization
problem by assuming that the beam-column can bend only in the vertical (xy) plane.
Assume the material to be steel with a specific weight of 0.3 lb/in$^3$, Young’s modulus of
$30 \times 10^6$ psi, and a yield stress of 30,000 psi. The width of the beam is required to be at
least 0.5 in. and not greater than twice the depth. Also, find the solution of the problem
graphically. Hint: The compressive stress in the beam-column due to $P_y$ is $P_y/bd$ and
that due to $P_x$ is

$$\frac{P_x l d}{2 I_{zz}} = \frac{6 P_x l}{b d^2}$$

The axial buckling load is given by

$$(P_y)_{\text{cri}} = \frac{\pi^2 E I_{zz}}{4 l^2} = \frac{\pi^2 E b d^3}{48 l^2}$$

1.20 A two-bar truss is to be designed to carry a load of 1W as shown in Fig. 1.25. Both 
bars have a tubular section with mean diameter d and wall thickness t. The material 
of the bars has Young’s modulus $E$ and yield stress $\sigma_y$. The design problem involves
the determination of the values of d and t so that the weight of the truss is a minimum 
and neither yielding nor buckling occurs in any of the bars. Formulate the problem as a 
nonlinear programming problem. 



Figure 1.25 Two-bar truss. 
Figure 1.26 Processing plant layout (coordinates in ft).


1.21 Consider the problem of determining the economic lot sizes for four different items.
Assume that the demand occurs at a constant rate over time. The stock for the
$i$th item is replenished instantaneously upon request in lots of sizes $Q_i$. The total
storage space available is $A$, whereas each unit of item $i$ occupies an area $d_i$. The
objective is to find the values of $Q_i$ that optimize the per unit cost of holding the
inventory and of ordering subject to the storage area constraint. The cost function is
given by

$$C = \sum_{i=1}^{4} \left(\frac{a_i}{Q_i} + b_i Q_i\right), \quad Q_i > 0$$

where $a_i$ and $b_i$ are fixed constants. Formulate the problem as a dynamic programming
(optimal control) model. Assume that $Q_i$ is discrete.

1.22 The layout of a processing plant, consisting of a pump (P), a water tank ( T ), a com- 
pressor (C), and a fan ( F ), is shown in Fig. 1.26. The locations of the various units, in 
terms of their (x, y) coordinates, are also indicated in this figure. It is decided to add a 
new unit, a heat exchanger (H), to the plant. To avoid congestion, it is decided to locate 
H within a rectangular area defined by {−15 ≤ x ≤ 15, −10 ≤ y ≤ 10}. Formulate the
problem of finding the location of H to minimize the sum of its x and y distances from
the existing units, P, T, C, and F.

1.23 Two copper-based alloys (brasses), A and B, are mixed to produce a new alloy, C. 
The composition of alloys A and B and the requirements of alloy C are given in the 
following table: 
                 Composition by weight
Alloy      Copper      Zinc      Lead      Tin
A            80         10         6        4
B            60         20        18        2
C          > 75       > 15      > 16      > 3


If alloy B costs twice as much as alloy A, formulate the problem of determining the 
amounts of A and B to be mixed to produce alloy C at a minimum cost. 

1.24 An oil refinery produces four grades of motor oil in three process plants. The refinery 
incurs a penalty for not meeting the demand of any particular grade of motor oil. The 
capacities of the plants, the production costs, the demands of the various grades of motor 
oil, and the penalties are given in the following table: 


Process    Capacity of the       Production cost ($/day) to
plant      plant (kgal/day)      manufacture motor oil of grade:
                                    1        2        3        4
1               100                750      900     1000     1200
2               150                800      950     1100     1400
3               200                900     1000     1200     1600
Demand (kgal/day)                   50      150      100       75
Penalty (per each kilogallon
shortage)                          $10      $12      $16      $20


Formulate the problem of minimizing the overall cost as an LP problem. 

1.25 A part-time graduate student in engineering is enrolled in a four-unit mathematics course 
and a three-unit design course. Since the student has to work for 20 hours a week at a 
local software company, he can spend a maximum of 40 hours a week to study outside 
the class. It is known from students who took the courses previously that the numerical 
grade (g) in each course is related to the study time spent outside the class as $g_m = t_m/6$
and $g_d = t_d/5$, where g indicates the numerical grade (g = 4 for A, 3 for B, 2 for C, 1 for
D, and 0 for F), t represents the time spent in hours per week to study outside the class, 
and the subscripts m and d denote the courses, mathematics and design, respectively. 
The student enjoys design more than mathematics and hence would like to spend at least 
75 minutes to study for design for every 60 minutes he spends to study mathematics. 
Also, as far as possible, the student does not want to spend more time on any course 
beyond the time required to earn a grade of A. The student wishes to maximize his grade 
point P, given by $P = 4g_m + 3g_d$, by suitably distributing his study time. Formulate
the problem as an LP problem. 

1.26 The scaffolding system, shown in Fig. 1.27, is used to carry a load of 10,000 lb. Assuming 
that the weights of the beams and the ropes are negligible, formulate the problem of 
determining the values of $x_1$, $x_2$, $x_3$, and $x_4$ to minimize the tension in ropes A and B
while maintaining positive tensions in ropes C, D, E, and F. 

1.27 Formulate the problem of minimum weight design of a power screw subjected to an
axial load, F, as shown in Fig. 1.28 using the pitch (p), major diameter (d), nut height
(h), and screw length (s) as design variables. Consider the following constraints in the
formulation:

1. The screw should be self-locking [1.117].

2. The shear stress in the screw should not exceed the yield strength of the material in
shear. Assume the yield strength in shear (according to distortion energy theory) to
be $0.577\sigma_y$, where $\sigma_y$ is the yield strength of the material.

3. The bearing stress in the threads should not exceed the yield strength of the material,
$\sigma_y$.

4. The critical buckling load of the screw should be less than the applied load, F.

Figure 1.27 Scaffolding system.

Figure 1.28 Power screw.

1.28 (a) A simply supported beam of hollow rectangular section is to be designed for
minimum weight to carry a vertical load $F_y$ and an axial load P as shown in Fig. 1.29.
The deflection of the beam in the y direction under the self-weight and $F_y$ should
not exceed 0.5 in. The beam should not buckle either in the yz or the xz plane under
the axial load. Assuming the ends of the beam to be pin ended, formulate the
optimization problem using $x_i$, i = 1, 2, 3, 4 as design variables for the following data:
$F_y$ = 300 lb, P = 40,000 lb, l = 120 in., E = $30 \times 10^6$ psi, $\rho$ = 0.284 lb/in$^3$, lower
bound on $x_1$ and $x_2$ = 0.125 in., upper bound on $x_1$ and $x_2$ = 4 in.

(b) Formulate the problem stated in part (a) using $x_1$ and $x_2$ as design variables, assuming
the beam to have a solid rectangular cross section. Also find the solution of the
problem using a graphical technique.

Figure 1.29 Simply supported beam under loads.

1.29 A cylindrical pressure vessel with hemispherical ends (Fig. 1.30) is required to hold 
at least 20,000 gallons of a fluid under a pressure of 2500 psia. The thicknesses of 
the cylindrical and hemispherical parts of the shell should be equal to at least those 
recommended by section VIII of the ASME pressure vessel code, which are given by 


$$t_c = \frac{pR}{Se + 0.4p}$$

$$t_h = \frac{pR}{Se + 0.8p}$$

where S is the yield strength, e the joint efficiency, p the pressure, and R the radius.
Formulate the design problem for minimum structural volume using $x_i$, i = 1, 2, 3, 4, as
design variables. Assume the following data: S = 30,000 psi and e = 1.0.

Figure 1.30 Pressure vessel.

Figure 1.31 Crane hook carrying a load.

1.30 A crane hook is to be designed to carry a load F as shown in Fig. 1.31. The hook can 
be modeled as a three-quarter circular ring with a rectangular cross section. The stresses 
induced at the inner and outer fibers at section AB should not exceed the yield strength 
of the material. Formulate the problem of minimum volume design of the hook using 
$r_o$, $r_i$, b, and h as design variables. Note: The stresses induced at points A and B are
given by [1.117]

$$\sigma_A = \frac{M c_o}{A e r_o}$$

$$\sigma_B = \frac{M c_i}{A e r_i}$$

where M is the bending moment due to the load (= FR), R the radius of the centroid,
$r_o$ the radius of the outer fiber, $r_i$ the radius of the inner fiber, $c_o$ the distance of the
outer fiber from the neutral axis = $r_o - r_n$, $c_i$ the distance of the inner fiber from the neutral
axis = $r_n - r_i$, $r_n$ the radius of the neutral axis, given by

$$r_n = \frac{h}{\ln(r_o/r_i)}$$

A the cross-sectional area of the hook = bh, and e the distance between the centroidal
and neutral axes = $R - r_n$.


1.31 Consider the four-bar truss shown in Fig. 1.32, in which members 1, 2, and 3 have
the same cross-sectional area $x_1$ and the same length l, while member 4 has an area of
cross section $x_2$ and length $\sqrt{3}\,l$. The truss is made of a lightweight material for which
Young’s modulus and the weight density are given by $30 \times 10^6$ psi and 0.03333 lb/in$^3$,
respectively. The truss is subject to the loads $P_1$ = 10,000 lb and $P_2$ = 20,000 lb. The
weight of the truss per unit value of l can be expressed as

$$f = 3x_1(1)(0.03333) + x_2\sqrt{3}\,(0.03333) = 0.1x_1 + 0.05773x_2$$

The vertical deflection of joint A can be expressed as

$$\delta_A = \frac{0.6}{x_1} + \frac{0.3464}{x_2}$$

and the stresses in members 1 and 4 can be written as

$$\sigma_1 = \frac{5(10{,}000)}{x_1} = \frac{50{,}000}{x_1}, \qquad \sigma_4 = \frac{-2\sqrt{3}\,(10{,}000)}{x_2} = -\frac{34{,}640}{x_2}$$

The weight of the truss is to be minimized with constraints on the vertical deflection of
the joint A and the stresses in members 1 and 4. The maximum permissible deflection
of joint A is 0.1 in. and the permissible stresses in members are $\sigma_{\max}$ = 8333.3333 psi
(tension) and $\sigma_{\min}$ = −4948.5714 psi (compression). The optimization problem can be
stated as a separable programming problem as follows:

$$\text{Minimize } f(x_1, x_2) = 0.1x_1 + 0.05773x_2$$

subject to

$$\frac{0.6}{x_1} + \frac{0.3464}{x_2} - 0.1 \le 0, \qquad 6 - x_1 \le 0, \qquad 7 - x_2 \le 0$$


Determine the solution of the problem using a graphical procedure. 

1.32 A simply supported beam, with a uniform rectangular cross section, is subjected to both 
distributed and concentrated loads as shown in Fig. 1.33. It is desired to find the cross 
section of the beam to minimize the weight of the beam while ensuring that the maximum 
stress induced in the beam does not exceed the permissible stress ($\sigma_0$) of the material
and the maximum deflection of the beam does not exceed a specified limit ($\delta_0$).
The data of the problem are P = $10^5$ N, $p_0$ = $10^6$ N/m, L = 1 m, E = 207 GPa, weight
density ($\rho_w$) = 76.5 kN/m$^3$, $\sigma_0$ = 220 MPa, and $\delta_0$ = 0.02 m.

Figure 1.33 A simply supported beam subjected to concentrated and distributed loads.

(a) Formulate the problem as a mathematical programming problem assuming that
the cross-sectional dimensions of the beam are restricted as $x_1 \le x_2$, 0.04 m ≤ $x_1$
≤ 0.12 m, and 0.06 m ≤ $x_2$ ≤ 0.20 m.

(b) Find the solution of the problem formulated in part (a) using MATLAB. 

(c) Find the solution of the problem formulated in part (a) graphically. 

1.33 Solve Problem 1.32, parts (a), (b), and (c), assuming the cross section of the beam to 
be hollow circular with inner diameter $x_1$ and outer diameter $x_2$. Assume the data and
bounds on the design variables to be as given in Problem 1.32. 

1.34 Find the solution of Problem 1.31 using MATLAB. 

1.35 Find the solution of Problem 1.2(a) using MATLAB. 

1.36 Find the solution of Problem 1.2(b) using MATLAB. 
Classical Optimization Techniques


2.1 INTRODUCTION 

The classical methods of optimization are useful in finding the optimum solution of 
continuous and differentiable functions. These methods are analytical and make use 
of the techniques of differential calculus in locating the optimum points. Since some 
of the practical problems involve objective functions that are not continuous and/or 
differentiable, the classical optimization techniques have limited scope in practical 
applications. However, a study of the calculus methods of optimization forms a basis for 
developing most of the numerical techniques of optimization presented in subsequent 
chapters. In this chapter we present the necessary and sufficient conditions in locating 
the optimum solution of a single-variable function, a multivariable function with no 
constraints, and a multivariable function with equality and inequality constraints. 


2.2 SINGLE-VARIABLE OPTIMIZATION 

A function of one variable f(x) is said to have a relative or local minimum at x =
x* if f(x*) ≤ f(x* + h) for all sufficiently small positive and negative values of h.
Similarly, a point x* is called a relative or local maximum if f(x*) ≥ f(x* + h) for
all values of h sufficiently close to zero. A function f(x) is said to have a global
or absolute minimum at x* if f(x*) ≤ f(x) for all x, and not just for all x close to
x*, in the domain over which f(x) is defined. Similarly, a point x* will be a global
maximum of f(x) if f(x*) ≥ f(x) for all x in the domain. Figure 2.1 shows the
difference between the local and global optimum points.

A single-variable optimization problem is one in which the value of x = x* is to be
found in the interval [a, b] such that x* minimizes f(x). The following two theorems
provide the necessary and sufficient conditions for the relative minimum of a function
of a single variable.

Theorem 2.1 Necessary Condition If a function f(x) is defined in the interval a ≤
x ≤ b and has a relative minimum at x = x*, where a < x* < b, and if the derivative
df(x)/dx = f'(x) exists as a finite number at x = x*, then f'(x*) = 0.


Proof: It is given that

$$f'(x^*) = \lim_{h \to 0} \frac{f(x^* + h) - f(x^*)}{h} \tag{2.1}$$


Engineering Optimization: Theory and Practice, Fourth Edition  Singiresu S. Rao
Figure 2.1 Relative and global minima. (Legend recovered from the figure: A1, A2, A3 = relative maxima; A2 = global maximum.)


exists as a definite number, which we want to prove to be zero. Since x* is a relative
minimum, we have

$$f(x^*) \le f(x^* + h)$$

for all values of h sufficiently close to zero. Hence

$$\frac{f(x^* + h) - f(x^*)}{h} \ge 0 \quad \text{if } h > 0$$

$$\frac{f(x^* + h) - f(x^*)}{h} \le 0 \quad \text{if } h < 0$$

Thus Eq. (2.1) gives the limit as h tends to zero through positive values as

$$f'(x^*) \ge 0 \tag{2.2}$$

while it gives the limit as h tends to zero through negative values as

$$f'(x^*) \le 0 \tag{2.3}$$

The only way to satisfy both Eqs. (2.2) and (2.3) is to have

$$f'(x^*) = 0 \tag{2.4}$$


This proves the theorem. 


Notes: 


1. This theorem can be proved even if x* is a relative maximum. 

2. The theorem does not say what happens if a minimum or maximum occurs at
a point x* where the derivative fails to exist. For example, in Fig. 2.2,

$$\lim_{h \to 0} \frac{f(x^* + h) - f(x^*)}{h} = m^+ \ (\text{positive}) \quad \text{or} \quad m^- \ (\text{negative})$$

depending on whether h approaches zero through positive or negative values,
respectively. Unless the numbers $m^+$ and $m^-$ are equal, the derivative f'(x*)
does not exist. If f'(x*) does not exist, the theorem is not applicable.


Figure 2.2 Derivative undefined at x*.


3. The theorem does not say what happens if a minimum or maximum occurs at
an endpoint of the interval of definition of the function. In this case

$$\lim_{h \to 0} \frac{f(x^* + h) - f(x^*)}{h}$$

exists for positive values of h only or for negative values of h only, and hence
the derivative is not defined at the endpoints.

4. The theorem does not say that the function necessarily will have a minimum
or maximum at every point where the derivative is zero. For example, the
derivative f'(x) = 0 at x = 0 for the function shown in Fig. 2.3. However, this
point is neither a minimum nor a maximum. In general, a point x* at which
f'(x*) = 0 is called a stationary point.
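Notes 2 and 4 can be illustrated numerically. The sketch below is not part of the original text; it assumes two stand-in functions, |x| for the kinked case of Fig. 2.2 and x^3 for the stationary point of Fig. 2.3, and the helper name `one_sided_slopes` is hypothetical. It approximates the one-sided limits with finite difference quotients.

```python
def one_sided_slopes(func, x_star, h=1e-6):
    """Approximate the limits of the difference quotient at x_star
    as h -> 0 through positive values and through negative values."""
    forward = (func(x_star + h) - func(x_star)) / h       # h > 0
    backward = (func(x_star - h) - func(x_star)) / (-h)   # h < 0
    return forward, backward


# Note 2: |x| has a minimum at 0, but m+ = +1 and m- = -1 differ,
# so f'(0) does not exist and Theorem 2.1 is not applicable.
m_plus, m_minus = one_sided_slopes(abs, 0.0)
print("abs:", m_plus, m_minus)

# Note 4: x**3 is stationary at 0 (both slopes vanish), yet the function
# takes smaller values on the left and larger values on the right.
cube = lambda x: x ** 3
c_plus, c_minus = one_sided_slopes(cube, 0.0)
print("cube:", c_plus, c_minus, cube(-0.1) < cube(0.0) < cube(0.1))
```

The step size h = 1e-6 is an arbitrary choice; a smaller h sharpens the estimate until round-off error starts to dominate.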

If the function f(x) possesses continuous derivatives of every order that comes into
question in the neighborhood of x = x*, the following theorem provides the sufficient
condition for the minimum or maximum value of the function.


Figure 2.3 Stationary (inflection) point.
Theorem 2.2 Sufficient Condition Let $f'(x^*) = f''(x^*) = \cdots = f^{(n-1)}(x^*) = 0$,
but $f^{(n)}(x^*) \ne 0$. Then f(x*) is (i) a minimum value of f(x) if $f^{(n)}(x^*) > 0$ and n
is even; (ii) a maximum value of f(x) if $f^{(n)}(x^*) < 0$ and n is even; (iii) neither a
maximum nor a minimum if n is odd.

Proof: Applying Taylor’s theorem with remainder after n terms, we have

$$f(x^* + h) = f(x^*) + h f'(x^*) + \frac{h^2}{2!} f''(x^*) + \cdots + \frac{h^{n-1}}{(n-1)!} f^{(n-1)}(x^*) + \frac{h^n}{n!} f^{(n)}(x^* + \theta h) \quad \text{for } 0 < \theta < 1 \tag{2.5}$$

Since $f'(x^*) = f''(x^*) = \cdots = f^{(n-1)}(x^*) = 0$, Eq. (2.5) becomes

$$f(x^* + h) - f(x^*) = \frac{h^n}{n!} f^{(n)}(x^* + \theta h)$$

As $f^{(n)}(x^*) \ne 0$, there exists an interval around x* for every point x of which the nth
derivative $f^{(n)}(x)$ has the same sign, namely, that of $f^{(n)}(x^*)$. Thus for every point
x* + h of this interval, $f^{(n)}(x^* + \theta h)$ has the sign of $f^{(n)}(x^*)$. When n is even, $h^n/n!$ is
positive irrespective of whether h is positive or negative, and hence f(x* + h) − f(x*)
will have the same sign as that of $f^{(n)}(x^*)$. Thus x* will be a relative minimum if
$f^{(n)}(x^*)$ is positive and a relative maximum if $f^{(n)}(x^*)$ is negative. When n is odd,
$h^n/n!$ changes sign with the change in the sign of h and hence the point x* is neither
a maximum nor a minimum. In this case the point x* is called a point of inflection.

Example 2.1 Determine the maximum and minimum values of the function

$$f(x) = 12x^5 - 45x^4 + 40x^3 + 5$$

SOLUTION Since $f'(x) = 60(x^4 - 3x^3 + 2x^2) = 60x^2(x - 1)(x - 2)$, f'(x) = 0 at
x = 0, x = 1, and x = 2. The second derivative is

$$f''(x) = 60(4x^3 - 9x^2 + 4x)$$

At x = 1, f''(x) = −60 and hence x = 1 is a relative maximum. Therefore,

$$f_{\max} = f(x = 1) = 12$$

At x = 2, f''(x) = 240 and hence x = 2 is a relative minimum. Therefore,

$$f_{\min} = f(x = 2) = -11$$

At x = 0, f''(x) = 0 and hence we must investigate the next derivative:

$$f'''(x) = 60(12x^2 - 18x + 4) = 240 \quad \text{at } x = 0$$

Since f'''(x) ≠ 0 at x = 0, x = 0 is neither a maximum nor a minimum, and it is an
inflection point.
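The test of Theorem 2.2 is mechanical enough to sketch in code. The following is a minimal illustration, not part of the original text, restricted to polynomials stored as ascending-order coefficient lists; the helper names `poly_eval`, `poly_derivative`, and `classify_stationary_point` are hypothetical. It repeats the classification of Example 2.1 using exact integer arithmetic.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients in ascending order) by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def poly_derivative(coeffs):
    """Return the ascending-order coefficients of the derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def classify_stationary_point(coeffs, x_star, max_order=10):
    """Apply Theorem 2.2: find the first nonvanishing derivative at x_star."""
    d = poly_derivative(coeffs)
    if poly_eval(d, x_star) != 0:
        return "not stationary"
    for n in range(2, max_order + 1):
        d = poly_derivative(d)
        if not d:            # derivative is identically zero
            break
        value = poly_eval(d, x_star)
        if value != 0:
            if n % 2 == 1:   # n odd: neither maximum nor minimum
                return "inflection"
            return "minimum" if value > 0 else "maximum"
    return "undetermined"


# f(x) = 12x^5 - 45x^4 + 40x^3 + 5, ascending order
f_coeffs = [5, 0, 0, 40, -45, 12]
for x in (0, 1, 2):
    print(x, classify_stationary_point(f_coeffs, x), poly_eval(f_coeffs, x))
```

For a general (non-polynomial) function the exact comparison with zero would have to be replaced by a tolerance, which is one reason the analytical test is preferred when derivatives are available in closed form.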
Example 2.2 In a two-stage compressor, the working gas leaving the first stage of
compression is cooled (by passing it through a heat exchanger) before it enters the
second stage of compression to increase the efficiency [2.13]. The total work input to
a compressor (W) for an ideal gas, for isentropic compression, is given by

$$W = c_p T_1 \left[ \left( \frac{p_2}{p_1} \right)^{(k-1)/k} + \left( \frac{p_3}{p_2} \right)^{(k-1)/k} - 2 \right]$$

where $c_p$ is the specific heat of the gas at constant pressure, k is the ratio of specific
heat at constant pressure to that at constant volume of the gas, and $T_1$ is the temperature
at which the gas enters the compressor. Find the pressure, $p_2$, at which intercooling
should be done to minimize the work input to the compressor. Also determine the
minimum work done on the compressor.

SOLUTION The necessary condition for minimizing the work done on the compressor is

$$\frac{dW}{dp_2} = c_p T_1 \frac{k-1}{k} \left[ \left( \frac{1}{p_1} \right)^{(k-1)/k} (p_2)^{-1/k} - (p_3)^{(k-1)/k} (p_2)^{(1-2k)/k} \right] = 0$$

which yields

$$p_2 = (p_1 p_3)^{1/2}$$

The second derivative of W with respect to $p_2$ gives

$$\frac{d^2W}{dp_2^2} = c_p T_1 \frac{k-1}{k} \left[ -\frac{1}{k} \left( \frac{1}{p_1} \right)^{(k-1)/k} (p_2)^{-(1+k)/k} + \frac{2k-1}{k} (p_3)^{(k-1)/k} (p_2)^{(1-3k)/k} \right]$$

$$\left( \frac{d^2W}{dp_2^2} \right)_{p_2 = (p_1 p_3)^{1/2}} = \frac{2 c_p T_1 \left( \dfrac{k-1}{k} \right)^2}{p_1^{(3k-1)/2k} \, p_3^{(k+1)/2k}}$$

Since the ratio of specific heats k is greater than 1, we get

$$\frac{d^2W}{dp_2^2} > 0 \quad \text{at } p_2 = (p_1 p_3)^{1/2}$$

and hence the solution corresponds to a relative minimum. The minimum work done
is given by

$$W_{\min} = 2 c_p T_1 \left[ \left( \frac{p_3}{p_1} \right)^{(k-1)/2k} - 1 \right]$$
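The optimum intercooling pressure can be cross-checked numerically. The sketch below is an illustration only, not part of the original text: the values $p_1$ = 1, $p_3$ = 9, k = 1.4 and the normalization $c_p T_1$ = 1 are assumed, and a golden-section search (a technique swapped in here for convenience, not used in this example) replaces the calculus. It minimizes W over $p_2$ and compares the result with $(p_1 p_3)^{1/2}$.

```python
import math


def work_input(p2, p1, p3, k, cp_T1=1.0):
    """W = cp*T1*[(p2/p1)^a + (p3/p2)^a - 2] with a = (k - 1)/k."""
    a = (k - 1.0) / k
    return cp_T1 * ((p2 / p1) ** a + (p3 / p2) ** a - 2.0)


def golden_section_min(func, lo, hi, tol=1e-10):
    """Minimize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while abs(hi - lo) > tol:
        if func(c) < func(d):
            hi = d          # minimum lies in [lo, d]
        else:
            lo = c          # minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


# Assumed illustrative data (not from the text)
p1, p3, k = 1.0, 9.0, 1.4
p2_numeric = golden_section_min(lambda p: work_input(p, p1, p3, k), p1, p3)
p2_analytic = math.sqrt(p1 * p3)
w_min_formula = 2.0 * ((p3 / p1) ** ((k - 1.0) / (2.0 * k)) - 1.0)

print("numerical p2 :", p2_numeric)
print("analytical p2:", p2_analytic)
print("W at optimum :", work_input(p2_analytic, p1, p3, k), "formula:", w_min_formula)
```

Because W is nearly flat around the minimum, the numerical estimate of $p_2$ agrees with the analytical value only to roughly the square root of machine precision, which is typical of derivative-free searches.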
2.3 MULTIVARIABLE OPTIMIZATION WITH NO CONSTRAINTS


In this section we consider the necessary and sufficient conditions for the minimum 
or maximum of an unconstrained function of several variables. Before seeing these 
conditions, we consider the Taylor’s series expansion of a multivariable function. 


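Definition: rth Differential of f. If all partial derivatives of the function f through
order r ≥ 1 exist and are continuous at a point X*, the polynomial

$$d^r f(\mathbf{X}^*) = \underbrace{\sum_{i=1}^{n} \sum_{j=1}^{n} \cdots \sum_{k=1}^{n}}_{r \text{ summations}} h_i h_j \cdots h_k \, \frac{\partial^r f(\mathbf{X}^*)}{\partial x_i \, \partial x_j \cdots \partial x_k} \tag{2.6}$$

is called the rth differential of f at X*. Notice that there are r summations and one $h_i$
is associated with each summation in Eq. (2.6).

For example, when r = 2 and n = 3, we have

$$d^2 f(\mathbf{X}^*) = d^2 f(x_1^*, x_2^*, x_3^*) = \sum_{i=1}^{3} \sum_{j=1}^{3} h_i h_j \frac{\partial^2 f(\mathbf{X}^*)}{\partial x_i \, \partial x_j}$$

$$= h_1^2 \frac{\partial^2 f}{\partial x_1^2}(\mathbf{X}^*) + h_2^2 \frac{\partial^2 f}{\partial x_2^2}(\mathbf{X}^*) + h_3^2 \frac{\partial^2 f}{\partial x_3^2}(\mathbf{X}^*) + 2h_1 h_2 \frac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{X}^*) + 2h_2 h_3 \frac{\partial^2 f}{\partial x_2 \partial x_3}(\mathbf{X}^*) + 2h_1 h_3 \frac{\partial^2 f}{\partial x_1 \partial x_3}(\mathbf{X}^*)$$

The Taylor’s series expansion of a function f(X) about a point X* is given by

$$f(\mathbf{X}) = f(\mathbf{X}^*) + df(\mathbf{X}^*) + \frac{1}{2!} d^2 f(\mathbf{X}^*) + \frac{1}{3!} d^3 f(\mathbf{X}^*) + \cdots + \frac{1}{N!} d^N f(\mathbf{X}^*) + R_N(\mathbf{X}^*, \mathbf{h}) \tag{2.7}$$

where the last term, called the remainder, is given by

$$R_N(\mathbf{X}^*, \mathbf{h}) = \frac{1}{(N+1)!} \, d^{N+1} f(\mathbf{X}^* + \theta \mathbf{h}) \tag{2.8}$$

where 0 < θ < 1 and h = X − X*.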

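Example 2.3 Find the second-order Taylor’s series approximation of the function

$$f(x_1, x_2, x_3) = x_2^2 x_3 + x_1 e^{x_3}$$

about the point $\mathbf{X}^* = \{1, 0, -2\}^T$.

SOLUTION The second-order Taylor’s series approximation of the function f about
point X* is given by

$$f(\mathbf{X}) \approx f(\mathbf{X}^*) + df(\mathbf{X}^*) + \frac{1}{2!} d^2 f(\mathbf{X}^*), \qquad \mathbf{X}^* = \{1, 0, -2\}^T$$

where

$$f(\mathbf{X}^*) = e^{-2}$$

$$df(\mathbf{X}^*) = h_1 \frac{\partial f}{\partial x_1}(\mathbf{X}^*) + h_2 \frac{\partial f}{\partial x_2}(\mathbf{X}^*) + h_3 \frac{\partial f}{\partial x_3}(\mathbf{X}^*)$$

$$= \left[ h_1 e^{x_3} + h_2 (2 x_2 x_3) + h_3 x_2^2 + h_3 x_1 e^{x_3} \right]_{\mathbf{X}^*} = h_1 e^{-2} + h_3 e^{-2}$$

$$d^2 f(\mathbf{X}^*) = \sum_{i=1}^{3} \sum_{j=1}^{3} h_i h_j \frac{\partial^2 f}{\partial x_i \, \partial x_j}(\mathbf{X}^*) = \left[ h_1^2 \frac{\partial^2 f}{\partial x_1^2} + h_2^2 \frac{\partial^2 f}{\partial x_2^2} + h_3^2 \frac{\partial^2 f}{\partial x_3^2} + 2h_1 h_2 \frac{\partial^2 f}{\partial x_1 \partial x_2} + 2h_2 h_3 \frac{\partial^2 f}{\partial x_2 \partial x_3} + 2h_1 h_3 \frac{\partial^2 f}{\partial x_1 \partial x_3} \right]_{\mathbf{X}^*}$$

$$= \left[ h_1^2 (0) + h_2^2 (2x_3) + h_3^2 (x_1 e^{x_3}) + 2h_1 h_2 (0) + 2h_2 h_3 (2x_2) + 2h_1 h_3 (e^{x_3}) \right]_{\mathbf{X}^*} = -4h_2^2 + e^{-2} h_3^2 + 2h_1 h_3 e^{-2}$$

Thus the Taylor’s series approximation is given by

$$f(\mathbf{X}) \approx e^{-2} + e^{-2}(h_1 + h_3) + \frac{1}{2!} \left( -4h_2^2 + e^{-2} h_3^2 + 2h_1 h_3 e^{-2} \right)$$

where $h_1 = x_1 - 1$, $h_2 = x_2$, and $h_3 = x_3 + 2$.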
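A quick numerical check of this approximation is sketched below; it is an illustration added here, not part of the original example, and the function names `f_exact` and `taylor2` as well as the test offsets are assumptions. If the second-order expansion is right, the error should shrink roughly with the cube of the distance from X*.

```python
import math


def f_exact(x1, x2, x3):
    """f(x1, x2, x3) = x2^2 * x3 + x1 * exp(x3)."""
    return x2 ** 2 * x3 + x1 * math.exp(x3)


def taylor2(x1, x2, x3):
    """Second-order Taylor approximation of f about X* = (1, 0, -2)."""
    h1, h2, h3 = x1 - 1.0, x2, x3 + 2.0
    e2 = math.exp(-2.0)
    first = e2 * (h1 + h3)
    second = 0.5 * (-4.0 * h2 ** 2 + e2 * h3 ** 2 + 2.0 * h1 * h3 * e2)
    return e2 + first + second


errors = {}
for step in (0.1, 0.01):
    point = (1.0 + step, step, -2.0 + step)   # move every coordinate by `step`
    errors[step] = abs(f_exact(*point) - taylor2(*point))
    print(step, f_exact(*point), taylor2(*point), errors[step])
```

Reducing the offset by a factor of 10 should reduce the error by a factor of about 1000, consistent with a remainder of third order in h.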


Theorem 2.3 Necessary Condition If f(X) has an extreme point (maximum or minimum)
at X = X* and if the first partial derivatives of f(X) exist at X*, then

$$\frac{\partial f}{\partial x_1}(\mathbf{X}^*) = \frac{\partial f}{\partial x_2}(\mathbf{X}^*) = \cdots = \frac{\partial f}{\partial x_n}(\mathbf{X}^*) = 0 \tag{2.9}$$

Proof: The proof given for Theorem 2.1 can easily be extended to prove the present
theorem. However, we present a different approach to prove this theorem. Suppose that
one of the first partial derivatives, say the kth one, does not vanish at X*. Then, by
Taylor’s theorem,

$$f(\mathbf{X}^* + \mathbf{h}) = f(\mathbf{X}^*) + \sum_{i=1}^{n} h_i \frac{\partial f}{\partial x_i}(\mathbf{X}^*) + R_1(\mathbf{X}^*, \mathbf{h})$$

that is,

$$f(\mathbf{X}^* + \mathbf{h}) - f(\mathbf{X}^*) = h_k \frac{\partial f}{\partial x_k}(\mathbf{X}^*) + \frac{1}{2!} d^2 f(\mathbf{X}^* + \theta \mathbf{h}), \qquad 0 < \theta < 1$$

Since $d^2 f(\mathbf{X}^* + \theta \mathbf{h})$ is of order $h_i^2$, the terms of order h will dominate the higher-order
terms for small h. Thus the sign of f(X* + h) − f(X*) is decided by the sign of
$h_k \, \partial f(\mathbf{X}^*)/\partial x_k$. Suppose that $\partial f(\mathbf{X}^*)/\partial x_k > 0$. Then the sign of f(X* + h) − f(X*)
will be positive for $h_k > 0$ and negative for $h_k < 0$. This means that X* cannot be
an extreme point. The same conclusion can be obtained even if we assume that
$\partial f(\mathbf{X}^*)/\partial x_k < 0$. Since this conclusion is in contradiction with the original statement
that X* is an extreme point, we may say that $\partial f/\partial x_k = 0$ at X = X*. Hence the theorem
is proved.

Theorem 2.4 Sufficient Condition A sufficient condition for a stationary point X*
to be an extreme point is that the matrix of second partial derivatives (Hessian matrix)
of f(X) evaluated at X* is (i) positive definite when X* is a relative minimum point,
and (ii) negative definite when X* is a relative maximum point.

Proof: From Taylor’s theorem we can write

$$f(\mathbf{X}^* + \mathbf{h}) = f(\mathbf{X}^*) + \sum_{i=1}^{n} h_i \frac{\partial f}{\partial x_i}(\mathbf{X}^*) + \frac{1}{2!} \sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \left. \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right|_{\mathbf{X} = \mathbf{X}^* + \theta \mathbf{h}}, \qquad 0 < \theta < 1 \tag{2.10}$$

Since X* is a stationary point, the necessary conditions give (Theorem 2.3)

$$\frac{\partial f}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n$$

Thus Eq. (2.10) reduces to

$$f(\mathbf{X}^* + \mathbf{h}) - f(\mathbf{X}^*) = \frac{1}{2!} \sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \left. \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right|_{\mathbf{X} = \mathbf{X}^* + \theta \mathbf{h}}, \qquad 0 < \theta < 1$$

Therefore, the sign of

$$f(\mathbf{X}^* + \mathbf{h}) - f(\mathbf{X}^*)$$

will be same as that of

$$\sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \left. \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right|_{\mathbf{X} = \mathbf{X}^* + \theta \mathbf{h}}$$

Since the second partial derivative $\partial^2 f(\mathbf{X})/\partial x_i \, \partial x_j$ is continuous in the neighborhood
of X*,

$$\left. \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right|_{\mathbf{X} = \mathbf{X}^* + \theta \mathbf{h}}$$

will have the same sign as $(\partial^2 f/\partial x_i \, \partial x_j)|_{\mathbf{X} = \mathbf{X}^*}$ for all sufficiently small h. Thus
f(X* + h) − f(X*) will be positive, and hence X* will be a relative minimum, if

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \left. \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right|_{\mathbf{X} = \mathbf{X}^*} \tag{2.11}$$

is positive. This quantity Q is a quadratic form and can be written in matrix form as

$$Q = \mathbf{h}^T \mathbf{J} \mathbf{h} \,|_{\mathbf{X} = \mathbf{X}^*} \tag{2.12}$$

where

$$\mathbf{J}|_{\mathbf{X} = \mathbf{X}^*} = \left[ \left. \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right|_{\mathbf{X} = \mathbf{X}^*} \right] \tag{2.13}$$

is the matrix of second partial derivatives and is called the Hessian matrix of f(X).

It is known from matrix algebra that the quadratic form of Eq. (2.11) or (2.12)
will be positive for all h if and only if [J] is positive definite at X = X*. This means
that a sufficient condition for the stationary point X* to be a relative minimum is that
the Hessian matrix evaluated at the same point be positive definite. This completes the
proof for the minimization case. By proceeding in a similar manner, it can be proved
that the Hessian matrix will be negative definite if X* is a relative maximum point.

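Note: A matrix A will be positive definite if all its eigenvalues are positive; that
is, all the values of λ that satisfy the determinantal equation

$$|\mathbf{A} - \lambda \mathbf{I}| = 0 \tag{2.14}$$

should be positive. Similarly, the matrix [A] will be negative definite if its eigenvalues
are negative.

Another test that can be used to find the positive definiteness of a matrix A of
order n involves evaluation of the determinants

$$A_1 = |a_{11}|, \qquad A_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \qquad A_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \quad \ldots, \quad A_n = \begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{vmatrix}$$

The matrix A will be positive definite if and only if all the values $A_1, A_2, A_3, \ldots, A_n$
are positive. The matrix A will be negative definite if and only if the sign of $A_j$ is
$(-1)^j$ for j = 1, 2, . . . , n. If some of the $A_j$ are positive and the remaining $A_j$ are
zero, the matrix A may be positive semidefinite; confirming this requires that all
principal minors of A, not only the leading ones $A_j$, be nonnegative.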
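The determinant test can be sketched in a few lines of code. The following is an illustrative implementation, not part of the original text, assuming a symmetric matrix given as a list of rows; the function names `determinant`, `leading_principal_minors`, and `definiteness` are hypothetical. It evaluates $A_1, \ldots, A_n$ and applies the two sign rules stated above, reporting every other case simply as "not definite".

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def leading_principal_minors(matrix):
    """Return [A_1, A_2, ..., A_n] for a square matrix."""
    n = len(matrix)
    return [determinant([row[:j] for row in matrix[:j]]) for j in range(1, n + 1)]


def definiteness(matrix):
    """Classify a symmetric matrix using the signs of A_1, ..., A_n."""
    minors = leading_principal_minors(matrix)
    if all(m > 0 for m in minors):
        return "positive definite"
    # Negative definite: sign of A_j equals (-1)^j
    if all((m < 0) if j % 2 == 1 else (m > 0)
           for j, m in enumerate(minors, start=1)):
        return "negative definite"
    return "not definite"


print(definiteness([[2, -1], [-1, 2]]))     # minors 2, 3
print(definiteness([[-2, 1], [1, -2]]))     # minors -2, 3
print(definiteness([[2, 0], [0, -2]]))      # minors 2, -4
```

The exact comparisons with zero are adequate for small, well-scaled matrices; for matrices with floating-point entries of widely varying magnitude, an eigenvalue-based test with a tolerance would be a more robust choice.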


Example 2.4 Figure 2.4 shows two frictionless rigid bodies (carts) A and B connected
by three linear elastic springs having spring constants $k_1$, $k_2$, and $k_3$. The springs are
at their natural positions when the applied force P is zero. Find the displacements $x_1$
and $x_2$ under the force P by using the principle of minimum potential energy.

SOLUTION According to the principle of minimum potential energy, the system will
be in equilibrium under the load P if the potential energy is a minimum. The potential
energy of the system is given by

potential energy (U)
= strain energy of springs − work done by external forces

$$U = \left[ \tfrac{1}{2} k_2 x_1^2 + \tfrac{1}{2} k_3 (x_2 - x_1)^2 + \tfrac{1}{2} k_1 x_2^2 \right] - P x_2$$

The necessary conditions for the minimum of U are

$$\frac{\partial U}{\partial x_1} = k_2 x_1 - k_3 (x_2 - x_1) = 0 \tag{E$_1$}$$

$$\frac{\partial U}{\partial x_2} = k_3 (x_2 - x_1) + k_1 x_2 - P = 0 \tag{E$_2$}$$

The values of $x_1$ and $x_2$ corresponding to the equilibrium state, obtained by solving
Eqs. (E$_1$) and (E$_2$), are given by

$$x_1^* = \frac{P k_3}{k_1 k_2 + k_1 k_3 + k_2 k_3}$$

$$x_2^* = \frac{P (k_2 + k_3)}{k_1 k_2 + k_1 k_3 + k_2 k_3}$$

The sufficiency conditions for the minimum at $(x_1^*, x_2^*)$ can also be verified by testing
the positive definiteness of the Hessian matrix of U. The Hessian matrix of U evaluated
at $(x_1^*, x_2^*)$ is

$$\mathbf{J}|_{(x_1^*, x_2^*)} = \begin{bmatrix} \dfrac{\partial^2 U}{\partial x_1^2} & \dfrac{\partial^2 U}{\partial x_1 \partial x_2} \\ \dfrac{\partial^2 U}{\partial x_1 \partial x_2} & \dfrac{\partial^2 U}{\partial x_2^2} \end{bmatrix}_{(x_1^*, x_2^*)} = \begin{bmatrix} k_2 + k_3 & -k_3 \\ -k_3 & k_1 + k_3 \end{bmatrix}$$

The determinants of the square submatrices of J are

$$J_1 = |k_2 + k_3| = k_2 + k_3 > 0$$

$$J_2 = \begin{vmatrix} k_2 + k_3 & -k_3 \\ -k_3 & k_1 + k_3 \end{vmatrix} = k_1 k_2 + k_1 k_3 + k_2 k_3 > 0$$

since the spring constants are always positive. Thus the matrix J is positive definite
and hence $(x_1^*, x_2^*)$ corresponds to the minimum of potential energy.
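The closed-form displacements can be cross-checked by solving the two stationarity conditions as a linear system. The sketch below is illustrative only and not part of the original example: the spring constants and load ($k_1$ = 1, $k_2$ = 2, $k_3$ = 3, P = 11) are assumed values, and the function names are hypothetical. It solves the system by Cramer's rule and confirms that small perturbations raise the potential energy.

```python
def potential_energy(x1, x2, k1, k2, k3, P):
    """U = strain energy of the three springs minus work done by P."""
    return 0.5 * k2 * x1 ** 2 + 0.5 * k3 * (x2 - x1) ** 2 + 0.5 * k1 * x2 ** 2 - P * x2


def spring_displacements(k1, k2, k3, P):
    """Solve the stationarity conditions J x = b with b = (0, P) by Cramer's rule."""
    j11, j12 = k2 + k3, -k3
    j21, j22 = -k3, k1 + k3
    det = j11 * j22 - j12 * j21          # = k1*k2 + k1*k3 + k2*k3
    x1 = (0.0 * j22 - j12 * P) / det
    x2 = (j11 * P - j21 * 0.0) / det
    return x1, x2


# Assumed illustrative data (not from the text)
k1, k2, k3, P = 1.0, 2.0, 3.0, 11.0
x1_star, x2_star = spring_displacements(k1, k2, k3, P)
u_star = potential_energy(x1_star, x2_star, k1, k2, k3, P)
print("x1* =", x1_star, "x2* =", x2_star, "U* =", u_star)

# Every small perturbation should increase U at a minimum
for dx1, dx2 in ((0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1), (0.1, -0.1)):
    print(dx1, dx2, potential_energy(x1_star + dx1, x2_star + dx2, k1, k2, k3, P) > u_star)
```

Because U is quadratic with a constant positive definite Hessian, the stationary point found this way is the unique global minimum, not merely a relative one.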


2.3.1 Semidefinite Case 

We now consider the problem of determining the sufficient conditions for the case 
when the Hessian matrix of the given function is semidefinite. In the case of a func- 
tion of a single variable, the problem of determining the sufficient conditions for 
the case when the second derivative is zero was resolved quite easily. We simply 
investigated the higher-order derivatives in the Taylor’s series expansion. A simi- 
lar procedure can be followed for functions of n variables. However, the algebra 
becomes quite involved, and hence we rarely investigate the stationary points for suf- 
ficiency in actual practice. The following theorem, analogous to Theorem 2.2, gives 
the sufficiency conditions for the extreme points of a function of several variables. 


Theorem 2.5 Let the partial derivatives of f of all orders up to the order k ≥ 2 be
continuous in the neighborhood of a stationary point X*, and

$$d^r f|_{\mathbf{X} = \mathbf{X}^*} = 0, \qquad 1 \le r \le k - 1$$

$$d^k f|_{\mathbf{X} = \mathbf{X}^*} \ne 0$$

so that $d^k f|_{\mathbf{X} = \mathbf{X}^*}$ is the first nonvanishing higher-order differential of f at X*. If k is
even, then (i) X* is a relative minimum if $d^k f|_{\mathbf{X} = \mathbf{X}^*}$ is positive definite, (ii) X* is a
relative maximum if $d^k f|_{\mathbf{X} = \mathbf{X}^*}$ is negative definite, and (iii) if $d^k f|_{\mathbf{X} = \mathbf{X}^*}$ is semidefinite
(but not definite), no general conclusion can be drawn. On the other hand, if k is odd,
X* is not an extreme point of f(X).

Proof : A proof similar to that of Theorem 2.2 can be found in Ref. [2.5]. 

2.3.2 Saddle Point 

In the case of a function of two variables, f(x, y), the Hessian matrix may be neither
positive nor negative definite at a point (x*, y*) at which

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0$$

In such a case, the point (x*, y*) is called a saddle point. The characteristic of a
saddle point is that it corresponds to a relative minimum or maximum of f(x, y) with
respect to one variable, say, x (the other variable being fixed at y = y*) and a relative
maximum or minimum of f(x, y) with respect to the second variable y (the other
variable being fixed at x*).

As an example, consider the function $f(x, y) = x^2 - y^2$. For this function,

$$\frac{\partial f}{\partial x} = 2x \qquad \text{and} \qquad \frac{\partial f}{\partial y} = -2y$$

These first derivatives are zero at x* = 0 and y* = 0. The Hessian matrix of f at
(x*, y*) is given by

$$\mathbf{J} = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}$$

Since this matrix is neither positive definite nor negative definite, the point (x* = 0,
y* = 0) is a saddle point. The function is shown graphically in Fig. 2.5. It can be seen
that f(x, y*) = f(x, 0) has a relative minimum and f(x*, y) = f(0, y) has a relative
maximum at the saddle point (x*, y*). Saddle points may exist for functions of more
than two variables also. The characteristic of the saddle point stated above still holds
provided that x and y are interpreted as vectors in multidimensional cases.


Example 2.5 Find the extreme points of the function

$$f(x_1, x_2) = x_1^3 + x_2^3 + 2x_1^2 + 4x_2^2 + 6$$

SOLUTION The necessary conditions for the existence of an extreme point are

$$\frac{\partial f}{\partial x_1} = 3x_1^2 + 4x_1 = x_1(3x_1 + 4) = 0$$

$$\frac{\partial f}{\partial x_2} = 3x_2^2 + 8x_2 = x_2(3x_2 + 8) = 0$$

These equations are satisfied at the points

(0, 0), (0, −8/3), (−4/3, 0), and (−4/3, −8/3)

To find the nature of these extreme points, we have to use the sufficiency conditions.
The second-order partial derivatives of f are given by

$$\frac{\partial^2 f}{\partial x_1^2} = 6x_1 + 4, \qquad \frac{\partial^2 f}{\partial x_2^2} = 6x_2 + 8, \qquad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 0$$

The Hessian matrix of f is given by

$$\mathbf{J} = \begin{bmatrix} 6x_1 + 4 & 0 \\ 0 & 6x_2 + 8 \end{bmatrix}$$

If $J_1 = |6x_1 + 4|$ and $J_2 = \begin{vmatrix} 6x_1 + 4 & 0 \\ 0 & 6x_2 + 8 \end{vmatrix}$, the values of $J_1$ and $J_2$ and the nature
of the extreme point are as given below:

Point X          Value of J1   Value of J2   Nature of J          Nature of X         f(X)
(0, 0)               +4            +32       Positive definite    Relative minimum    6
(0, −8/3)            +4            −32       Indefinite           Saddle point        418/27
(−4/3, 0)            −4            −32       Indefinite           Saddle point        194/27
(−4/3, −8/3)         −4            +32       Negative definite    Relative maximum    50/3
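The table can be reproduced with a short script. This sketch is an illustration, not part of the original example; it hard-codes the derivatives worked out above, uses exact rational arithmetic so that the function values come out as fractions, and the names `f`, `classify`, and `results` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def f(x1, x2):
    """f(x1, x2) = x1^3 + x2^3 + 2*x1^2 + 4*x2^2 + 6."""
    return x1 ** 3 + x2 ** 3 + 2 * x1 ** 2 + 4 * x2 ** 2 + 6


def classify(x1, x2):
    """Classify a stationary point from the leading principal minors J1, J2."""
    j1 = 6 * x1 + 4
    j2 = (6 * x1 + 4) * (6 * x2 + 8)   # the Hessian is diagonal
    if j1 > 0 and j2 > 0:
        nature = "relative minimum"
    elif j1 < 0 and j2 > 0:
        nature = "relative maximum"
    elif j2 < 0:
        nature = "saddle point"
    else:
        nature = "undetermined"
    return j1, j2, nature


# Roots of x1*(3*x1 + 4) = 0 and x2*(3*x2 + 8) = 0
x1_roots = (Fraction(0), Fraction(-4, 3))
x2_roots = (Fraction(0), Fraction(-8, 3))

results = {}
for x1, x2 in product(x1_roots, x2_roots):
    results[(x1, x2)] = classify(x1, x2) + (f(x1, x2),)
    print((str(x1), str(x2)), *map(str, results[(x1, x2)]))
```

The two-minor shortcut works here only because the matrix is 2 × 2; for larger Hessians, an indefinite matrix has to be detected from the full sign pattern of the minors or from the eigenvalues.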


2.4 MULTIVARIABLE OPTIMIZATION WITH EQUALITY 
CONSTRAINTS 


In this section we consider the optimization of continuous functions subjected to
equality constraints:

$$\text{Minimize } f = f(\mathbf{X})$$

subject to

$$g_j(\mathbf{X}) = 0, \qquad j = 1, 2, \ldots, m \tag{2.16}$$

where

$$\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$$

Here m is less than or equal to n; otherwise (if m > n), the problem becomes overdefined
and, in general, there will be no solution. There are several methods available for the
solution of this problem. The methods of direct substitution, constrained variation, and
Lagrange multipliers are discussed in the following sections.


2.4.1 Solution by Direct Substitution

For a problem with n variables and m equality constraints, it is theoretically possible
to solve simultaneously the m equality constraints and express any set of m variables
in terms of the remaining n − m variables. When these expressions are substituted into
the original objective function, there results a new objective function involving only
n − m variables. The new objective function is not subjected to any constraint, and
hence its optimum can be found by using the unconstrained optimization techniques
discussed in Section 2.3.

This method of direct substitution, although it appears to be simple in theory, is
not convenient from a practical point of view. The reason for this is that the constraint
equations will be nonlinear for most practical problems, and often it becomes
impossible to solve them and express any m variables in terms of the remaining n − m
variables. However, the method of direct substitution might prove to be very simple
and direct for solving simpler problems, as shown by the following example.

Example 2.6 Find the dimensions of a box of largest volume that can be inscribed
in a sphere of unit radius.

SOLUTION Let the origin of the Cartesian coordinate system $x_1$, $x_2$, $x_3$ be at the
center of the sphere and the sides of the box be $2x_1$, $2x_2$, and $2x_3$. The volume of the
box is given by

$$f(x_1, x_2, x_3) = 8 x_1 x_2 x_3 \tag{E$_1$}$$

Since the corners of the box lie on the surface of the sphere of unit radius, $x_1$, $x_2$, and
$x_3$ have to satisfy the constraint

$$x_1^2 + x_2^2 + x_3^2 = 1 \tag{E$_2$}$$

This problem has three design variables and one equality constraint. Hence the
equality constraint can be used to eliminate any one of the design variables from the
objective function. If we choose to eliminate $x_3$, Eq. (E$_2$) gives

$$x_3 = (1 - x_1^2 - x_2^2)^{1/2} \tag{E$_3$}$$

Thus the objective function becomes

$$f(x_1, x_2) = 8 x_1 x_2 (1 - x_1^2 - x_2^2)^{1/2} \tag{E$_4$}$$

which can be maximized as an unconstrained function in two variables.
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The necessary conditions for the maximum of / give 


df_ 

3xj 


— 8x2 


,20/2 


3X2 


(1 - X, -x 2 ) 

(1 -x? -xf) 1/2 - 


(1-x2-x 2 2)1/2J 


(l-Xf-X 2 ) 1 ^ 


= 0 


(Es) 

(Ee) 


Equations (E 5 ) and (Eg) can be simplified to obtain 

1 — 2xf — x\ — 0 
1 — x 2 — 2 x 2 = 0 

from which it follows that x* = x| = 1/V3 and hence x| = 1/V3. This solution gives 
the maximum volume of the box as 


8 

3^3 


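To find whether the solution found corresponds to a maximum or a minimum,
we apply the sufficiency conditions to $f(x_1, x_2)$ of Eq. (E$_4$). The second-order partial
derivatives of $f$ at $(x_1^*, x_2^*)$ are given by

$$\frac{\partial^2 f}{\partial x_1^2} = -\frac{32}{\sqrt{3}} \quad \text{at } (x_1^*, x_2^*)$$

$$\frac{\partial^2 f}{\partial x_2^2} = -\frac{32}{\sqrt{3}} \quad \text{at } (x_1^*, x_2^*)$$

$$\frac{\partial^2 f}{\partial x_1 \partial x_2} = -\frac{16}{\sqrt{3}} \quad \text{at } (x_1^*, x_2^*)$$

Since

$$\frac{\partial^2 f}{\partial x_1^2} < 0 \quad \text{and} \quad \frac{\partial^2 f}{\partial x_1^2}\frac{\partial^2 f}{\partial x_2^2} - \left(\frac{\partial^2 f}{\partial x_1 \partial x_2}\right)^2 > 0$$

the Hessian matrix of $f$ is negative definite at $(x_1^*, x_2^*)$. Hence the point $(x_1^*, x_2^*)$
corresponds to the maximum of $f$.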
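The result of Example 2.6 can be checked numerically. The following is a minimal sketch, not part of the original solution: it evaluates the reduced objective of Eq. (E$_4$), estimates its gradient at $x_1 = x_2 = 1/\sqrt{3}$ by central differences, and compares the value with a coarse grid search. The function names, step size `h`, and grid resolution are arbitrary choices made for illustration.

```python
import math

def volume(x1, x2):
    # Eq. (E4): box volume after eliminating x3 with the sphere constraint.
    return 8.0 * x1 * x2 * math.sqrt(1.0 - x1**2 - x2**2)

def gradient(x1, x2, h=1e-6):
    # Central-difference estimate of (df/dx1, df/dx2).
    d1 = (volume(x1 + h, x2) - volume(x1 - h, x2)) / (2.0 * h)
    d2 = (volume(x1, x2 + h) - volume(x1, x2 - h)) / (2.0 * h)
    return d1, d2

xs = 1.0 / math.sqrt(3.0)
g = gradient(xs, xs)
f_star = volume(xs, xs)

# Coarse grid search over the part of the unit disk with x1, x2 > 0.
N = 200
best = max(
    volume(i / N, j / N)
    for i in range(1, N)
    for j in range(1, N)
    if i * i + j * j < N * N
)

print("gradient at candidate:", g)
print("f* =", f_star, " 8/(3*sqrt(3)) =", 8.0 / (3.0 * math.sqrt(3.0)))
print("best grid value:", best)
```

The gradient should vanish to within the finite-difference error, and no grid point should give a larger volume than the candidate.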


2.4.2 Solution by the Method of Constrained Variation 

The basic idea used in the method of constrained variation is to find a closed-form
expression for the first-order differential of $f$ ($df$) at all points at which the constraints
$g_j(\mathbf{X}) = 0$, $j = 1, 2, \ldots, m$, are satisfied. The desired optimum points are then obtained
by setting the differential $df$ equal to zero. Before presenting the general method,
we indicate its salient features through the following simple problem with $n = 2$ and
$m = 1$:

$$\text{Minimize } f(x_1, x_2) \tag{2.17}$$

subject to

$$g(x_1, x_2) = 0 \tag{2.18}$$

A necessary condition for $f$ to have a minimum at some point $(x_1^*, x_2^*)$ is that the total
derivative of $f(x_1, x_2)$ with respect to $x_1$ must be zero at $(x_1^*, x_2^*)$. By setting the total
differential of $f(x_1, x_2)$ equal to zero, we obtain

$$df = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 = 0 \tag{2.19}$$

Since $g(x_1^*, x_2^*) = 0$ at the minimum point, any variations $dx_1$ and $dx_2$ taken about
the point $(x_1^*, x_2^*)$ are called admissible variations provided that the new point lies on
the constraint:

$$g(x_1^* + dx_1, x_2^* + dx_2) = 0 \tag{2.20}$$

The Taylor's series expansion of the function in Eq. (2.20) about the point $(x_1^*, x_2^*)$
gives

$$g(x_1^* + dx_1, x_2^* + dx_2) \simeq g(x_1^*, x_2^*) + \frac{\partial g}{\partial x_1}(x_1^*, x_2^*)\,dx_1 + \frac{\partial g}{\partial x_2}(x_1^*, x_2^*)\,dx_2 = 0 \tag{2.21}$$

where $dx_1$ and $dx_2$ are assumed to be small. Since $g(x_1^*, x_2^*) = 0$, Eq. (2.21) reduces
to

$$dg = \frac{\partial g}{\partial x_1}\,dx_1 + \frac{\partial g}{\partial x_2}\,dx_2 = 0 \quad \text{at } (x_1^*, x_2^*) \tag{2.22}$$

Thus Eq. (2.22) has to be satisfied by all admissible variations. This is illustrated
in Fig. 2.6, where PQ indicates the curve at each point of which Eq. (2.18) is satisfied.
If A is taken as the base point $(x_1^*, x_2^*)$, the variations in $x_1$ and $x_2$ leading
to points B and C are called admissible variations. On the other hand, the variations
in $x_1$ and $x_2$ representing point D are not admissible since point D does not
lie on the constraint curve, $g(x_1, x_2) = 0$. Thus any set of variations $(dx_1, dx_2)$ that
does not satisfy Eq. (2.22) leads to points such as D, which do not satisfy constraint
Eq. (2.18).

Assuming that $\partial g/\partial x_2 \neq 0$, Eq. (2.22) can be rewritten as

$$dx_2 = -\frac{\partial g/\partial x_1}{\partial g/\partial x_2}(x_1^*, x_2^*)\,dx_1 \tag{2.23}$$


This relation indicates that once the variation in $x_1$ ($dx_1$) is chosen arbitrarily, the
variation in $x_2$ ($dx_2$) is decided automatically in order to have $dx_1$ and $dx_2$ as a set of
admissible variations. By substituting Eq. (2.23) in Eq. (2.19), we obtain

$$df = \left.\left(\frac{\partial f}{\partial x_1} - \frac{\partial g/\partial x_1}{\partial g/\partial x_2}\,\frac{\partial f}{\partial x_2}\right)\right|_{(x_1^*, x_2^*)} dx_1 = 0 \tag{2.24}$$

The expression on the left-hand side is called the constrained variation of $f$. Note that
Eq. (2.24) has to be satisfied for all values of $dx_1$. Since $dx_1$ can be chosen arbitrarily,
Eq. (2.24) leads to

$$\left.\left(\frac{\partial f}{\partial x_1}\frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2}\frac{\partial g}{\partial x_1}\right)\right|_{(x_1^*, x_2^*)} = 0 \tag{2.25}$$

Equation (2.25) represents a necessary condition in order to have $(x_1^*, x_2^*)$ as an extreme
point (minimum or maximum).


Example 2.7 A beam of uniform rectangular cross section is to be cut from a log 
having a circular cross section of diameter 2a. The beam has to be used as a cantilever 
beam (the length is fixed) to carry a concentrated load at the free end. Find the dimen- 
sions of the beam that correspond to the maximum tensile (bending) stress carrying 
capacity. 


SOLUTION From elementary strength of materials, we know that the tensile stress
induced in a rectangular beam ($\sigma$) at any fiber located a distance $y$ from the neutral
axis is given by

$$\frac{\sigma}{y} = \frac{M}{I}$$

where $M$ is the bending moment acting and $I$ is the moment of inertia of the cross
section about the $x$ axis. If the width and depth of the rectangular beam shown in
Fig. 2.7 are $2x$ and $2y$, respectively, the maximum tensile stress induced is given by

$$\sigma_{\max} = \frac{M}{I}\,y = \frac{My}{\frac{1}{12}(2x)(2y)^3} = \frac{3}{4}\frac{M}{xy^2}$$

Thus for any specified bending moment, the beam is said to have maximum tensile
stress carrying capacity if the maximum induced stress ($\sigma_{\max}$) is a minimum. Hence
we need to minimize $k/xy^2$ or maximize $Kxy^2$, where $k = 3M/4$ and $K = 1/k$, subject
to the constraint

$$x^2 + y^2 = a^2$$

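This problem has two variables and one constraint; hence Eq. (2.25) can be applied
for finding the optimum solution. Since

$$f = kx^{-1}y^{-2} \tag{E$_1$}$$

$$g = x^2 + y^2 - a^2 \tag{E$_2$}$$

we have

$$\frac{\partial f}{\partial x} = -kx^{-2}y^{-2}, \qquad \frac{\partial f}{\partial y} = -2kx^{-1}y^{-3}$$

$$\frac{\partial g}{\partial x} = 2x, \qquad \frac{\partial g}{\partial y} = 2y$$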


Equation (2.25) gives

$$-kx^{-2}y^{-2}(2y) + 2kx^{-1}y^{-3}(2x) = 0 \quad \text{at } (x^*, y^*)$$

that is,

$$y^* = \sqrt{2}\,x^* \tag{E$_3$}$$

Thus the beam of maximum tensile stress carrying capacity has a depth of $\sqrt{2}$ times
its breadth. The optimum values of $x$ and $y$ can be obtained from Eqs. (E$_3$) and (E$_2$) as

$$x^* = \frac{a}{\sqrt{3}} \quad \text{and} \quad y^* = \sqrt{2}\,\frac{a}{\sqrt{3}}$$
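As a quick check of Example 2.7, the sketch below evaluates the left-hand side of Eq. (2.25) at the reported optimum and compares the optimum with a brute-force scan along the constraint circle. The values $a = 1$ and $k = 1$, the parametrization $x = a\cos t$, $y = a\sin t$, and all function names are assumptions made only for this illustration.

```python
import math

def constrained_variation(x, y, k=1.0):
    # Left-hand side of Eq. (2.25): (df/dx)(dg/dy) - (df/dy)(dg/dx)
    df_dx = -k * x**-2 * y**-2
    df_dy = -2.0 * k * x**-1 * y**-3
    dg_dx = 2.0 * x
    dg_dy = 2.0 * y
    return df_dx * dg_dy - df_dy * dg_dx

a = 1.0                                    # assumed log radius
x_star = a / math.sqrt(3.0)
y_star = math.sqrt(2.0) * a / math.sqrt(3.0)
residual = constrained_variation(x_star, y_star)

# Scan the first quadrant of the circle and maximize x*y**2,
# which is equivalent to minimizing k/(x*y**2).
steps = 10000
best_t = max(
    (i * (math.pi / 2.0) / steps for i in range(1, steps)),
    key=lambda t: (a * math.cos(t)) * (a * math.sin(t)) ** 2,
)

print("residual of Eq. (2.25):", residual)
print("scan optimum x, y:", a * math.cos(best_t), a * math.sin(best_t))
print("reported x*, y*:  ", x_star, y_star)
```

The residual should be zero to rounding error, and the scan should land on the same point to within the scan resolution.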


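Necessary Conditions for a General Problem. The procedure indicated above can
be generalized to the case of a problem in $n$ variables with $m$ constraints. In this case,
each constraint equation $g_j(\mathbf{X}) = 0$, $j = 1, 2, \ldots, m$, gives rise to a linear equation in
the variations $dx_i$, $i = 1, 2, \ldots, n$. Thus there will be in all $m$ linear equations in $n$
variations. Hence any $m$ variations can be expressed in terms of the remaining $n - m$
variations. These expressions can be used to express the differential of the objective
function, $df$, in terms of the $n - m$ independent variations. By letting the coefficients
of the independent variations vanish in the equation $df = 0$, one obtains the necessary
conditions for the constrained optimum of the given function. These conditions can be
expressed as [2.6]

$$J\left(\frac{f, g_1, g_2, \ldots, g_m}{x_k, x_1, x_2, x_3, \ldots, x_m}\right) =
\begin{vmatrix}
\dfrac{\partial f}{\partial x_k} & \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_m} \\
\dfrac{\partial g_1}{\partial x_k} & \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} & \cdots & \dfrac{\partial g_1}{\partial x_m} \\
\dfrac{\partial g_2}{\partial x_k} & \dfrac{\partial g_2}{\partial x_1} & \dfrac{\partial g_2}{\partial x_2} & \cdots & \dfrac{\partial g_2}{\partial x_m} \\
\vdots & \vdots & \vdots & & \vdots \\
\dfrac{\partial g_m}{\partial x_k} & \dfrac{\partial g_m}{\partial x_1} & \dfrac{\partial g_m}{\partial x_2} & \cdots & \dfrac{\partial g_m}{\partial x_m}
\end{vmatrix} = 0 \tag{2.26}$$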


where $k = m + 1, m + 2, \ldots, n$. It is to be noted that the variations of the first $m$ variables
($dx_1, dx_2, \ldots, dx_m$) have been expressed in terms of the variations of the remaining
$n - m$ variables ($dx_{m+1}, dx_{m+2}, \ldots, dx_n$) in deriving Eqs. (2.26). This implies that
the following relation is satisfied:

$$J\left(\frac{g_1, g_2, \ldots, g_m}{x_1, x_2, \ldots, x_m}\right) \neq 0 \tag{2.27}$$

The $n - m$ equations given by Eqs. (2.26) represent the necessary conditions for the
extremum of $f(\mathbf{X})$ under the $m$ equality constraints, $g_j(\mathbf{X}) = 0$, $j = 1, 2, \ldots, m$.


Example 2.8 


$$\text{Minimize } f(\mathbf{Y}) = \tfrac{1}{2}(y_1^2 + y_2^2 + y_3^2 + y_4^2) \tag{E$_1$}$$

subject to

$$g_1(\mathbf{Y}) = y_1 + 2y_2 + 3y_3 + 5y_4 - 10 = 0 \tag{E$_2$}$$

$$g_2(\mathbf{Y}) = y_1 + 2y_2 + 5y_3 + 6y_4 - 15 = 0 \tag{E$_3$}$$

SOLUTION This problem can be solved by applying the necessary conditions given
by Eqs. (2.26). Since $n = 4$ and $m = 2$, we have to select two variables as independent
variables. First we show that any arbitrary set of variables cannot be chosen as independent
variables since the remaining (dependent) variables have to satisfy the condition
of Eq. (2.27).
In terms of the notation of our equations, let us take the independent variables as
$x_3 = y_3$ and $x_4 = y_4$ so that $x_1 = y_1$ and $x_2 = y_2$.

Then the Jacobian of Eq. (2.27) becomes

$$J\left(\frac{g_1, g_2}{x_1, x_2}\right) =
\begin{vmatrix}
\dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_2} \\
\dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_2}
\end{vmatrix} =
\begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix} = 0$$

and hence the necessary conditions of Eqs. (2.26) cannot be applied.

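Next, let us take the independent variables as $x_3 = y_2$ and $x_4 = y_4$ so that $x_1 = y_1$
and $x_2 = y_3$. Then the Jacobian of Eq. (2.27) becomes

$$J\left(\frac{g_1, g_2}{x_1, x_2}\right) =
\begin{vmatrix}
\dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_3} \\
\dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_3}
\end{vmatrix} =
\begin{vmatrix} 1 & 3 \\ 1 & 5 \end{vmatrix} = 2 \neq 0$$

and hence the necessary conditions of Eqs. (2.26) can be applied. Equations (2.26) give
for $k = m + 1 = 3$

$$\begin{vmatrix}
\dfrac{\partial f}{\partial x_3} & \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} \\
\dfrac{\partial g_1}{\partial x_3} & \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} \\
\dfrac{\partial g_2}{\partial x_3} & \dfrac{\partial g_2}{\partial x_1} & \dfrac{\partial g_2}{\partial x_2}
\end{vmatrix} =
\begin{vmatrix}
\dfrac{\partial f}{\partial y_2} & \dfrac{\partial f}{\partial y_1} & \dfrac{\partial f}{\partial y_3} \\
\dfrac{\partial g_1}{\partial y_2} & \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_3} \\
\dfrac{\partial g_2}{\partial y_2} & \dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_3}
\end{vmatrix} =
\begin{vmatrix} y_2 & y_1 & y_3 \\ 2 & 1 & 3 \\ 2 & 1 & 5 \end{vmatrix}$$

$$= y_2(5 - 3) - y_1(10 - 6) + y_3(2 - 2) = 2y_2 - 4y_1 = 0 \tag{E$_4$}$$

and for $k = m + 2 = n = 4$,

$$\begin{vmatrix}
\dfrac{\partial f}{\partial x_4} & \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} \\
\dfrac{\partial g_1}{\partial x_4} & \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} \\
\dfrac{\partial g_2}{\partial x_4} & \dfrac{\partial g_2}{\partial x_1} & \dfrac{\partial g_2}{\partial x_2}
\end{vmatrix} =
\begin{vmatrix}
\dfrac{\partial f}{\partial y_4} & \dfrac{\partial f}{\partial y_1} & \dfrac{\partial f}{\partial y_3} \\
\dfrac{\partial g_1}{\partial y_4} & \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_3} \\
\dfrac{\partial g_2}{\partial y_4} & \dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_3}
\end{vmatrix} =
\begin{vmatrix} y_4 & y_1 & y_3 \\ 5 & 1 & 3 \\ 6 & 1 & 5 \end{vmatrix}$$

$$= y_4(5 - 3) - y_1(25 - 18) + y_3(5 - 6) = 2y_4 - 7y_1 - y_3 = 0 \tag{E$_5$}$$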

Equations (E$_4$) and (E$_5$) give the necessary conditions for the minimum or the maximum
of $f$ as

$$y_1 = \tfrac{1}{2}y_2, \qquad y_3 = 2y_4 - 7y_1 = 2y_4 - \tfrac{7}{2}y_2 \tag{E$_6$}$$

When Eqs. (E$_6$) are substituted, Eqs. (E$_2$) and (E$_3$) take the form

$$-8y_2 + 11y_4 = 10$$
$$-15y_2 + 16y_4 = 15$$

from which the desired optimum solution can be obtained as

$$y_1^* = -\frac{5}{74}, \qquad y_2^* = -\frac{5}{37}, \qquad y_3^* = \frac{155}{74}, \qquad y_4^* = \frac{30}{37}$$
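The arithmetic of Example 2.8 can be verified exactly with rational numbers. The sketch below solves the reduced 2 x 2 system by Cramer's rule, recovers the remaining variables from Eqs. (E$_4$) and (E$_5$), and checks both original constraints. The variable names are chosen for this illustration only.

```python
from fractions import Fraction as F

# Reduced constraints after eliminating y1 and y3:
#   -8*y2 + 11*y4 = 10
#  -15*y2 + 16*y4 = 15
a11, a12, b1 = F(-8), F(11), F(10)
a21, a22, b2 = F(-15), F(16), F(15)

det = a11 * a22 - a12 * a21           # Cramer's rule for the 2 x 2 system
y2 = (b1 * a22 - a12 * b2) / det
y4 = (a11 * b2 - b1 * a21) / det

y1 = y2 / 2                           # from (E4): 2*y2 - 4*y1 = 0
y3 = 2 * y4 - 7 * y1                  # from (E5): 2*y4 - 7*y1 - y3 = 0

g1 = y1 + 2 * y2 + 3 * y3 + 5 * y4 - 10   # constraint (E2)
g2 = y1 + 2 * y2 + 5 * y3 + 6 * y4 - 15   # constraint (E3)
f_star = (y1**2 + y2**2 + y3**2 + y4**2) / 2

print("y* =", y1, y2, y3, y4)
print("g1 =", g1, " g2 =", g2, " f* =", f_star)
```

Both constraint residuals come out as exactly zero, which confirms the substitution steps.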


Sufficiency Conditions for a General Problem. By eliminating the first $m$ variables,
using the $m$ equality constraints (this is possible, at least in theory), the objective function
$f$ can be made to depend only on the remaining variables, $x_{m+1}, x_{m+2}, \ldots, x_n$.
Then the Taylor's series expansion of $f$, in terms of these variables, about the extreme
point $\mathbf{X}^*$ gives

$$f(\mathbf{X}^* + d\mathbf{X}) \simeq f(\mathbf{X}^*) + \sum_{i=m+1}^{n} \left(\frac{\partial f}{\partial x_i}\right)_g dx_i + \frac{1}{2!}\sum_{i=m+1}^{n}\sum_{j=m+1}^{n} \left(\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right)_g dx_i\,dx_j \tag{2.28}$$

where $(\partial f/\partial x_i)_g$ is used to denote the partial derivative of $f$ with respect to $x_i$
(holding all the other variables $x_{m+1}, x_{m+2}, \ldots, x_{i-1}, x_{i+1}, x_{i+2}, \ldots, x_n$ constant)
when $x_1, x_2, \ldots, x_m$ are allowed to change so that the constraints $g_j(\mathbf{X}^* + d\mathbf{X}) = 0$,
$j = 1, 2, \ldots, m$, are satisfied; the second derivative, $(\partial^2 f/\partial x_i\,\partial x_j)_g$, is used to denote
a similar meaning.
As an example, consider the problem of minimizing

$$f(\mathbf{X}) = f(x_1, x_2, x_3)$$

subject to the only constraint

$$g_1(\mathbf{X}) = x_1^2 + x_2^2 + x_3^2 - 8 = 0$$

Since $n = 3$ and $m = 1$ in this problem, one can think of any of the $m$ variables,
say $x_1$, to be dependent and the remaining $n - m$ variables, namely $x_2$ and $x_3$, to be
independent. Here the constrained partial derivative $(\partial f/\partial x_2)_g$, for example, means
the rate of change of $f$ with respect to $x_2$ (holding the other independent variable $x_3$
constant) and at the same time allowing $x_1$ to change about $\mathbf{X}^*$ so as to satisfy the
constraint $g_1(\mathbf{X}) = 0$. In the present case, this means that $dx_1$ has to be chosen to
satisfy the relation

$$g_1(\mathbf{X}^* + d\mathbf{X}) \simeq g_1(\mathbf{X}^*) + \frac{\partial g_1}{\partial x_1}(\mathbf{X}^*)\,dx_1 + \frac{\partial g_1}{\partial x_2}(\mathbf{X}^*)\,dx_2 + \frac{\partial g_1}{\partial x_3}(\mathbf{X}^*)\,dx_3 = 0$$

that is,

$$2x_1^*\,dx_1 + 2x_2^*\,dx_2 = 0$$

since $g_1(\mathbf{X}^*) = 0$ at the optimum point and $dx_3 = 0$ ($x_3$ is held constant).

Notice that $(\partial f/\partial x_i)_g$ has to be zero for $i = m + 1, m + 2, \ldots, n$ since the $dx_i$
appearing in Eq. (2.28) are all independent. Thus the necessary conditions for the
existence of constrained optimum at $\mathbf{X}^*$ can also be expressed as

$$\left(\frac{\partial f}{\partial x_i}\right)_g = 0, \quad i = m + 1, m + 2, \ldots, n \tag{2.29}$$

Of course, with little manipulation, one can show that Eqs. (2.29) are nothing but
Eqs. (2.26). Further, as in the case of optimization of a multivariable function with no
constraints, one can see that a sufficient condition for $\mathbf{X}^*$ to be a constrained relative
minimum (maximum) is that the quadratic form $Q$ defined by

$$Q = \sum_{i=m+1}^{n}\sum_{j=m+1}^{n} \left(\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right)_g dx_i\,dx_j \tag{2.30}$$


is positive (negative) for all nonvanishing variations $dx_i$. As in Theorem 2.4, the matrix

$$\begin{bmatrix}
\left(\dfrac{\partial^2 f}{\partial x_{m+1}^2}\right)_g & \left(\dfrac{\partial^2 f}{\partial x_{m+1}\,\partial x_{m+2}}\right)_g & \cdots & \left(\dfrac{\partial^2 f}{\partial x_{m+1}\,\partial x_n}\right)_g \\
\vdots & \vdots & & \vdots \\
\left(\dfrac{\partial^2 f}{\partial x_n\,\partial x_{m+1}}\right)_g & \left(\dfrac{\partial^2 f}{\partial x_n\,\partial x_{m+2}}\right)_g & \cdots & \left(\dfrac{\partial^2 f}{\partial x_n^2}\right)_g
\end{bmatrix}$$

has to be positive (negative) definite to have $Q$ positive (negative) for all choices of
$dx_i$. It is evident that computation of the constrained derivatives $(\partial^2 f/\partial x_i\,\partial x_j)_g$ is a
difficult task and may be prohibitive for problems with more than three constraints.
Thus the method of constrained variation, although it appears to be simple in theory, is
very difficult to apply since the necessary conditions themselves involve evaluation of
determinants of order $m + 1$. This is the reason that the method of Lagrange multipliers,
discussed in the following section, is more commonly used to solve a multivariable
optimization problem with equality constraints.

2.4.3 Solution by the Method of Lagrange Multipliers

The basic features of the Lagrange multiplier method are given initially for a simple
problem of two variables with one constraint. The extension of the method to a general
problem of $n$ variables with $m$ constraints is given later.

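Problem with Two Variables and One Constraint. Consider the problem

$$\text{Minimize } f(x_1, x_2) \tag{2.31}$$

subject to

$$g(x_1, x_2) = 0$$

For this problem, the necessary condition for the existence of an extreme point at
$\mathbf{X} = \mathbf{X}^*$ was found in Section 2.4.2 to be

$$\left.\left(\frac{\partial f}{\partial x_1} - \frac{\partial f/\partial x_2}{\partial g/\partial x_2}\,\frac{\partial g}{\partial x_1}\right)\right|_{(x_1^*, x_2^*)} = 0 \tag{2.32}$$

By defining a quantity $\lambda$, called the Lagrange multiplier, as

$$\lambda = -\left.\left(\frac{\partial f/\partial x_2}{\partial g/\partial x_2}\right)\right|_{(x_1^*, x_2^*)} \tag{2.33}$$

Equation (2.32) can be expressed as

$$\left.\left(\frac{\partial f}{\partial x_1} + \lambda\frac{\partial g}{\partial x_1}\right)\right|_{(x_1^*, x_2^*)} = 0 \tag{2.34}$$

and Eq. (2.33) can be written as

$$\left.\left(\frac{\partial f}{\partial x_2} + \lambda\frac{\partial g}{\partial x_2}\right)\right|_{(x_1^*, x_2^*)} = 0 \tag{2.35}$$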


In addition, the constraint equation has to be satisfied at the extreme point, that is,

$$g(x_1, x_2)\big|_{(x_1^*, x_2^*)} = 0 \tag{2.36}$$

Thus Eqs. (2.34) to (2.36) represent the necessary conditions for the point $(x_1^*, x_2^*)$ to
be an extreme point.

Notice that the partial derivative $(\partial g/\partial x_2)|_{(x_1^*, x_2^*)}$ has to be nonzero to be able
to define $\lambda$ by Eq. (2.33). This is because the variation $dx_2$ was expressed in terms
of $dx_1$ in the derivation of Eq. (2.32) [see Eq. (2.23)]. On the other hand, if we
choose to express $dx_1$ in terms of $dx_2$, we would have obtained the requirement that
$(\partial g/\partial x_1)|_{(x_1^*, x_2^*)}$ be nonzero to define $\lambda$. Thus the derivation of the necessary conditions
by the method of Lagrange multipliers requires that at least one of the partial derivatives
of $g(x_1, x_2)$ be nonzero at an extreme point.

The necessary conditions given by Eqs. (2.34) to (2.36) are more commonly generated
by constructing a function $L$, known as the Lagrange function, as

$$L(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda g(x_1, x_2) \tag{2.37}$$

By treating $L$ as a function of the three variables $x_1$, $x_2$, and $\lambda$, the necessary conditions
for its extremum are given by

$$\begin{aligned}
\frac{\partial L}{\partial x_1}(x_1, x_2, \lambda) &= \frac{\partial f}{\partial x_1}(x_1, x_2) + \lambda\frac{\partial g}{\partial x_1}(x_1, x_2) = 0 \\
\frac{\partial L}{\partial x_2}(x_1, x_2, \lambda) &= \frac{\partial f}{\partial x_2}(x_1, x_2) + \lambda\frac{\partial g}{\partial x_2}(x_1, x_2) = 0 \\
\frac{\partial L}{\partial \lambda}(x_1, x_2, \lambda) &= g(x_1, x_2) = 0
\end{aligned} \tag{2.38}$$

Equations (2.38) can be seen to be the same as Eqs. (2.34) to (2.36). The sufficiency
conditions are given later.


Example 2.9 Find the solution of Example 2.7 using the Lagrange multiplier method:

$$\text{Minimize } f(x, y) = kx^{-1}y^{-2}$$

subject to

$$g(x, y) = x^2 + y^2 - a^2 = 0$$

SOLUTION The Lagrange function is

$$L(x, y, \lambda) = f(x, y) + \lambda g(x, y) = kx^{-1}y^{-2} + \lambda(x^2 + y^2 - a^2)$$

The necessary conditions for the minimum of $f(x, y)$ [Eqs. (2.38)] give

$$\frac{\partial L}{\partial x} = -kx^{-2}y^{-2} + 2x\lambda = 0 \tag{E$_1$}$$

$$\frac{\partial L}{\partial y} = -2kx^{-1}y^{-3} + 2y\lambda = 0 \tag{E$_2$}$$

$$\frac{\partial L}{\partial \lambda} = x^2 + y^2 - a^2 = 0 \tag{E$_3$}$$

Equations (E$_1$) and (E$_2$) yield

$$2\lambda = \frac{k}{x^3y^2} = \frac{2k}{xy^4}$$

from which the relation $x^* = (1/\sqrt{2})\,y^*$ can be obtained. This relation, along with
Eq. (E$_3$), gives the optimum solution as

$$x^* = \frac{a}{\sqrt{3}} \quad \text{and} \quad y^* = \sqrt{2}\,\frac{a}{\sqrt{3}}$$

Necessary Conditions for a General Problem. The equations derived above can be
extended to the case of a general problem with $n$ variables and $m$ equality constraints:

$$\text{Minimize } f(\mathbf{X}) \tag{2.39}$$

subject to

$$g_j(\mathbf{X}) = 0, \quad j = 1, 2, \ldots, m$$

The Lagrange function, $L$, in this case is defined by introducing one Lagrange multiplier
$\lambda_j$ for each constraint $g_j(\mathbf{X})$ as

$$L(x_1, x_2, \ldots, x_n, \lambda_1, \lambda_2, \ldots, \lambda_m) = f(\mathbf{X}) + \lambda_1 g_1(\mathbf{X}) + \lambda_2 g_2(\mathbf{X}) + \cdots + \lambda_m g_m(\mathbf{X}) \tag{2.40}$$

By treating $L$ as a function of the $n + m$ unknowns, $x_1, x_2, \ldots, x_n, \lambda_1, \lambda_2, \ldots, \lambda_m$,
the necessary conditions for the extremum of $L$, which also correspond to the solution
of the original problem stated in Eq. (2.39), are given by

$$\frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} + \sum_{j=1}^{m} \lambda_j\frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \tag{2.41}$$

$$\frac{\partial L}{\partial \lambda_j} = g_j(\mathbf{X}) = 0, \quad j = 1, 2, \ldots, m \tag{2.42}$$

Equations (2.41) and (2.42) represent $n + m$ equations in terms of the $n + m$ unknowns,
$x_i$ and $\lambda_j$. The solution of Eqs. (2.41) and (2.42) gives

$$\mathbf{X}^* = \begin{Bmatrix} x_1^* \\ x_2^* \\ \vdots \\ x_n^* \end{Bmatrix} \quad \text{and} \quad \boldsymbol{\lambda}^* = \begin{Bmatrix} \lambda_1^* \\ \lambda_2^* \\ \vdots \\ \lambda_m^* \end{Bmatrix}$$

The vector $\mathbf{X}^*$ corresponds to the relative constrained minimum of $f(\mathbf{X})$ (sufficient
conditions are to be verified) while the vector $\boldsymbol{\lambda}^*$ provides the sensitivity information,
as discussed in the next subsection.

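Sufficiency Conditions for a General Problem. A sufficient condition for $f(\mathbf{X})$ to
have a constrained relative minimum at $\mathbf{X}^*$ is given by the following theorem.

Theorem 2.6 Sufficient Condition A sufficient condition for $f(\mathbf{X})$ to have a relative
minimum at $\mathbf{X}^*$ is that the quadratic, $Q$, defined by

$$Q = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial^2 L}{\partial x_i\,\partial x_j}\,dx_i\,dx_j \tag{2.43}$$

evaluated at $\mathbf{X} = \mathbf{X}^*$ must be positive definite for all values of $d\mathbf{X}$ for which the
constraints are satisfied.

Proof: The proof is similar to that of Theorem 2.4.

Notes:

1. If

$$Q = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial^2 L}{\partial x_i\,\partial x_j}(\mathbf{X}^*, \boldsymbol{\lambda}^*)\,dx_i\,dx_j$$

is negative for all choices of the admissible variations $dx_i$, $\mathbf{X}^*$ will be a constrained
maximum of $f(\mathbf{X})$.

2. It has been shown by Hancock [2.1] that a necessary condition for the quadratic
form $Q$, defined by Eq. (2.43), to be positive (negative) definite for all admissible
variations $d\mathbf{X}$ is that each root of the polynomial in $z$, defined by the following
determinantal equation, be positive (negative):

$$\begin{vmatrix}
L_{11} - z & L_{12} & L_{13} & \cdots & L_{1n} & g_{11} & g_{21} & \cdots & g_{m1} \\
L_{21} & L_{22} - z & L_{23} & \cdots & L_{2n} & g_{12} & g_{22} & \cdots & g_{m2} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
L_{n1} & L_{n2} & L_{n3} & \cdots & L_{nn} - z & g_{1n} & g_{2n} & \cdots & g_{mn} \\
g_{11} & g_{12} & g_{13} & \cdots & g_{1n} & 0 & 0 & \cdots & 0 \\
g_{21} & g_{22} & g_{23} & \cdots & g_{2n} & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
g_{m1} & g_{m2} & g_{m3} & \cdots & g_{mn} & 0 & 0 & \cdots & 0
\end{vmatrix} = 0 \tag{2.44}$$

where

$$L_{ij} = \frac{\partial^2 L}{\partial x_i\,\partial x_j}(\mathbf{X}^*, \boldsymbol{\lambda}^*) \tag{2.45}$$

$$g_{ij} = \frac{\partial g_i}{\partial x_j}(\mathbf{X}^*) \tag{2.46}$$

3. Equation (2.44), on expansion, leads to an $(n - m)$th-order polynomial in $z$. If
some of the roots of this polynomial are positive while the others are negative,
the point $\mathbf{X}^*$ is not an extreme point.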

The application of the necessary and sufficient conditions in the Lagrange multiplier 
method is illustrated with the help of the following example. 

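Example 2.10 Find the dimensions of a cylindrical tin (with top and bottom) made
up of sheet metal to maximize its volume such that the total surface area is equal to
$A_0 = 24\pi$.

SOLUTION If $x_1$ and $x_2$ denote the radius of the base and length of the tin, respectively,
the problem can be stated as

$$\text{Maximize } f(x_1, x_2) = \pi x_1^2 x_2$$

subject to

$$2\pi x_1^2 + 2\pi x_1 x_2 = A_0 = 24\pi$$

The Lagrange function is

$$L(x_1, x_2, \lambda) = \pi x_1^2 x_2 + \lambda(2\pi x_1^2 + 2\pi x_1 x_2 - A_0)$$

and the necessary conditions for the maximum of $f$ give

$$\frac{\partial L}{\partial x_1} = 2\pi x_1 x_2 + 4\pi\lambda x_1 + 2\pi\lambda x_2 = 0 \tag{E$_1$}$$

$$\frac{\partial L}{\partial x_2} = \pi x_1^2 + 2\pi\lambda x_1 = 0 \tag{E$_2$}$$

$$\frac{\partial L}{\partial \lambda} = 2\pi x_1^2 + 2\pi x_1 x_2 - A_0 = 0 \tag{E$_3$}$$

Equations (E$_1$) and (E$_2$) lead to

$$\lambda = -\frac{x_1 x_2}{2x_1 + x_2} = -\frac{1}{2}x_1$$

that is,

$$x_1 = \tfrac{1}{2}x_2 \tag{E$_4$}$$

and Eqs. (E$_3$) and (E$_4$) give the desired solution as

$$x_1^* = \left(\frac{A_0}{6\pi}\right)^{1/2}, \qquad x_2^* = \left(\frac{2A_0}{3\pi}\right)^{1/2}, \qquad \text{and} \qquad \lambda^* = -\left(\frac{A_0}{24\pi}\right)^{1/2}$$

This gives the maximum value of $f$ as

$$f^* = \left(\frac{A_0^3}{54\pi}\right)^{1/2}$$

If $A_0 = 24\pi$, the optimum solution becomes

$$x_1^* = 2, \qquad x_2^* = 4, \qquad \lambda^* = -1, \qquad \text{and} \qquad f^* = 16\pi$$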

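To see that this solution really corresponds to the maximum of $f$, we apply the sufficiency
condition of Eq. (2.44). In this case

$$L_{11} = \left.\frac{\partial^2 L}{\partial x_1^2}\right|_{(\mathbf{X}^*, \lambda^*)} = 2\pi x_2^* + 4\pi\lambda^* = 4\pi$$

$$L_{12} = \left.\frac{\partial^2 L}{\partial x_1\,\partial x_2}\right|_{(\mathbf{X}^*, \lambda^*)} = L_{21} = 2\pi x_1^* + 2\pi\lambda^* = 2\pi$$

$$L_{22} = \left.\frac{\partial^2 L}{\partial x_2^2}\right|_{(\mathbf{X}^*, \lambda^*)} = 0$$

$$g_{11} = \left.\frac{\partial g_1}{\partial x_1}\right|_{(\mathbf{X}^*, \lambda^*)} = 4\pi x_1^* + 2\pi x_2^* = 16\pi$$

$$g_{12} = \left.\frac{\partial g_1}{\partial x_2}\right|_{(\mathbf{X}^*, \lambda^*)} = 2\pi x_1^* = 4\pi$$

Thus Eq. (2.44) becomes

$$\begin{vmatrix} 4\pi - z & 2\pi & 16\pi \\ 2\pi & 0 - z & 4\pi \\ 16\pi & 4\pi & 0 \end{vmatrix} = 0$$

that is,

$$272\pi^2 z + 192\pi^3 = 0$$

This gives

$$z = -\frac{12}{17}\pi$$

Since the value of $z$ is negative, the point $(x_1^*, x_2^*)$ corresponds to the maximum of $f$.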

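Interpretation of the Lagrange Multipliers. To find the physical meaning of the
Lagrange multipliers, consider the following optimization problem involving only a
single equality constraint:

$$\text{Minimize } f(\mathbf{X}) \tag{2.47}$$

subject to

$$\tilde{g}(\mathbf{X}) = b \quad \text{or} \quad g(\mathbf{X}) = b - \tilde{g}(\mathbf{X}) = 0 \tag{2.48}$$

where $b$ is a constant. The necessary conditions to be satisfied for the solution of the
problem are

$$\frac{\partial f}{\partial x_i} + \lambda\frac{\partial g}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \tag{2.49}$$

$$g = 0 \tag{2.50}$$

Let the solution of Eqs. (2.49) and (2.50) be given by $\mathbf{X}^*$, $\lambda^*$, and $f^* = f(\mathbf{X}^*)$.
Suppose that we want to find the effect of a small relaxation or tightening of the
constraint on the optimum value of the objective function (i.e., we want to find the
effect of a small change in $b$ on $f^*$). For this we differentiate Eq. (2.48) to obtain

$$db - d\tilde{g} = 0$$

or

$$db = d\tilde{g} = \sum_{i=1}^{n} \frac{\partial \tilde{g}}{\partial x_i}\,dx_i \tag{2.51}$$

Equation (2.49) can be rewritten as

$$\frac{\partial f}{\partial x_i} + \lambda\frac{\partial g}{\partial x_i} = \frac{\partial f}{\partial x_i} - \lambda\frac{\partial \tilde{g}}{\partial x_i} = 0 \tag{2.52}$$

or

$$\frac{\partial \tilde{g}}{\partial x_i} = \frac{\partial f/\partial x_i}{\lambda}, \quad i = 1, 2, \ldots, n \tag{2.53}$$

Substituting Eq. (2.53) into Eq. (2.51), we obtain

$$db = \sum_{i=1}^{n} \frac{1}{\lambda}\frac{\partial f}{\partial x_i}\,dx_i = \frac{df}{\lambda} \tag{2.54}$$

since

$$df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i \tag{2.55}$$

Equation (2.54) gives

$$\lambda = \frac{df}{db} \quad \text{or} \quad \lambda^* = \frac{df^*}{db} \tag{2.56}$$

or

$$df^* = \lambda^*\,db \tag{2.57}$$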


Thus $\lambda^*$ denotes the sensitivity (or rate of change) of $f$ with respect to $b$ or the marginal
or incremental change in $f^*$ with respect to $b$ at $\mathbf{X}^*$. In other words, $\lambda^*$ indicates how
tightly the constraint is binding at the optimum point. Depending on the value of $\lambda^*$
(positive, negative, or zero), the following physical meaning can be attributed to $\lambda^*$:

1. $\lambda^* > 0$. In this case, a unit decrease in $b$ is positively valued since one gets a
smaller minimum value of the objective function $f$. In fact, the decrease in $f^*$
will be exactly equal to $\lambda^*$ since $df = \lambda^*(-1) = -\lambda^* < 0$. Hence $\lambda^*$ may be
interpreted as the marginal gain (further reduction) in $f^*$ due to the tightening
of the constraint. On the other hand, if $b$ is increased by 1 unit, $f$ will also
increase to a new optimum level, with the amount of increase in $f^*$ being
determined by the magnitude of $\lambda^*$ since $df = \lambda^*(+1) > 0$. In this case, $\lambda^*$
may be thought of as the marginal cost (increase) in $f^*$ due to the relaxation
of the constraint.

2. $\lambda^* < 0$. Here a unit increase in $b$ is positively valued. This means that it
decreases the optimum value of $f$. In this case the marginal gain (reduction)
in $f^*$ due to a relaxation of the constraint by 1 unit is determined by the value
of $\lambda^*$ as $df^* = \lambda^*(+1) < 0$. If $b$ is decreased by 1 unit, the marginal cost
(increase) in $f^*$ by the tightening of the constraint is $df^* = \lambda^*(-1) > 0$ since,
in this case, the minimum value of the objective function increases.

3. $\lambda^* = 0$. In this case, any incremental change in $b$ has absolutely no effect on the
optimum value of $f$ and hence the constraint will not be binding. This means
that the optimization of $f$ subject to $g = 0$ leads to the same optimum point
$\mathbf{X}^*$ as with the unconstrained optimization of $f$.

In economics and operations research, Lagrange multipliers are known as shadow prices
of the constraints since they indicate the changes in optimal value of the objective
function per unit change in the right-hand side of the equality constraints.

Example 2.11 Find the maximum of the function $f(\mathbf{X}) = 2x_1 + x_2 + 10$ subject to
$g(\mathbf{X}) = x_1 + 2x_2^2 = 3$ using the Lagrange multiplier method. Also find the effect of
changing the right-hand side of the constraint on the optimum value of $f$.

SOLUTION The Lagrange function is given by

$$L(\mathbf{X}, \lambda) = 2x_1 + x_2 + 10 + \lambda(3 - x_1 - 2x_2^2) \tag{E$_1$}$$

The necessary conditions for the solution of the problem are

$$\begin{aligned}
\frac{\partial L}{\partial x_1} &= 2 - \lambda = 0 \\
\frac{\partial L}{\partial x_2} &= 1 - 4\lambda x_2 = 0 \\
\frac{\partial L}{\partial \lambda} &= 3 - x_1 - 2x_2^2 = 0
\end{aligned} \tag{E$_2$}$$

The solution of Eqs. (E$_2$) is

$$\mathbf{X}^* = \begin{Bmatrix} x_1^* \\ x_2^* \end{Bmatrix} = \begin{Bmatrix} 2.97 \\ 0.13 \end{Bmatrix}, \qquad \lambda^* = 2.0 \tag{E$_3$}$$

The application of the sufficiency condition of Eq. (2.44) yields

$$\begin{vmatrix} L_{11} - z & L_{12} & g_{11} \\ L_{21} & L_{22} - z & g_{12} \\ g_{11} & g_{12} & 0 \end{vmatrix} = 0$$

$$\begin{vmatrix} -z & 0 & -1 \\ 0 & -4\lambda - z & -4x_2 \\ -1 & -4x_2 & 0 \end{vmatrix} =
\begin{vmatrix} -z & 0 & -1 \\ 0 & -8 - z & -0.52 \\ -1 & -0.52 & 0 \end{vmatrix} = 0$$

$$0.2704z + 8 + z = 0$$

$$z = -6.2972$$

Hence $\mathbf{X}^*$ will be a maximum of $f$ with $f^* = f(\mathbf{X}^*) = 16.07$.
One procedure for finding the effect on $f^*$ of changes in the value of $b$ (right-hand
side of the constraint) would be to solve the problem all over with the new value of
$b$. Another procedure would involve the use of the value of $\lambda^*$. When the original
constraint is tightened by 1 unit (i.e., $db = -1$), Eq. (2.57) gives

$$df^* = \lambda^*\,db = 2(-1) = -2$$

Thus the new value of $f^*$ is $f^* + df^* = 14.07$. On the other hand, if we relax the
original constraint by 2 units (i.e., $db = 2$), we obtain

$$df^* = \lambda^*\,db = 2(+2) = 4$$

and hence the new value of $f^*$ is $f^* + df^* = 20.07$.
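The two procedures described for Example 2.11 can be compared directly. The sketch below re-solves the stationarity conditions for several values of the right-hand side $b$ and compares the change in $f^*$ with the prediction $df^* = \lambda^*\,db$. The function name `solve` is hypothetical. Note that with unrounded values the optimum is $x_2^* = 0.125$, $x_1^* = 2.96875$, and $f^* = 16.0625$; the value 16.07 in the text follows from the rounded components 2.97 and 0.13.

```python
def solve(b):
    """Stationary point of L = 2*x1 + x2 + 10 + lam*(b - x1 - 2*x2**2)."""
    lam = 2.0                      # from dL/dx1 = 2 - lam = 0
    x2 = 1.0 / (4.0 * lam)         # from dL/dx2 = 1 - 4*lam*x2 = 0
    x1 = b - 2.0 * x2**2           # from the constraint x1 + 2*x2**2 = b
    f_star = 2.0 * x1 + x2 + 10.0
    return x1, x2, lam, f_star

x1, x2, lam, f3 = solve(3.0)       # original problem, b = 3
f2 = solve(2.0)[3]                 # constraint tightened by 1 unit
f5 = solve(5.0)[3]                 # constraint relaxed by 2 units

print("b = 3:", (x1, x2, lam, f3))
print("re-solved change for db = -1:", f2 - f3, " predicted:", lam * (-1.0))
print("re-solved change for db = +2:", f5 - f3, " predicted:", lam * 2.0)
```

Because $b$ enters this particular problem linearly and $\lambda^*$ does not depend on $b$, the multiplier prediction matches the re-solved optimum exactly here; in general Eq. (2.57) is a first-order estimate.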


2.5 MULTIVARIABLE OPTIMIZATION WITH INEQUALITY 
CONSTRAINTS 

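This section is concerned with the solution of the following problem:

$$\text{Minimize } f(\mathbf{X})$$

subject to

$$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{2.58}$$

The inequality constraints in Eq. (2.58) can be transformed to equality constraints by
adding nonnegative slack variables, $y_j^2$, as

$$g_j(\mathbf{X}) + y_j^2 = 0, \quad j = 1, 2, \ldots, m \tag{2.59}$$

where the values of the slack variables are yet unknown. The problem now becomes

$$\text{Minimize } f(\mathbf{X})$$

subject to

$$G_j(\mathbf{X}, \mathbf{Y}) = g_j(\mathbf{X}) + y_j^2 = 0, \quad j = 1, 2, \ldots, m \tag{2.60}$$

where $\mathbf{Y} = \{y_1, y_2, \ldots, y_m\}^T$ is the vector of slack variables.

This problem can be solved conveniently by the method of Lagrange multipliers.
For this, we construct the Lagrange function $L$ as

$$L(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda}) = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j G_j(\mathbf{X}, \mathbf{Y}) \tag{2.61}$$

where $\boldsymbol{\lambda} = \{\lambda_1, \lambda_2, \ldots, \lambda_m\}^T$ is the vector of Lagrange multipliers. The stationary
points of the Lagrange function can be found by solving the following equations
(necessary conditions):

$$\frac{\partial L}{\partial x_i}(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda}) = \frac{\partial f}{\partial x_i}(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j\frac{\partial g_j}{\partial x_i}(\mathbf{X}) = 0, \quad i = 1, 2, \ldots, n \tag{2.62}$$

$$\frac{\partial L}{\partial \lambda_j}(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda}) = G_j(\mathbf{X}, \mathbf{Y}) = g_j(\mathbf{X}) + y_j^2 = 0, \quad j = 1, 2, \ldots, m \tag{2.63}$$

$$\frac{\partial L}{\partial y_j}(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda}) = 2\lambda_j y_j = 0, \quad j = 1, 2, \ldots, m \tag{2.64}$$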

It can be seen that Eqs. (2.62) to (2.64) represent $(n + 2m)$ equations in the $(n + 2m)$
unknowns, $\mathbf{X}$, $\boldsymbol{\lambda}$, and $\mathbf{Y}$. The solution of Eqs. (2.62) to (2.64) thus gives the optimum
solution vector, $\mathbf{X}^*$; the Lagrange multiplier vector, $\boldsymbol{\lambda}^*$; and the slack variable
vector, $\mathbf{Y}^*$.

Equations (2.63) ensure that the constraints $g_j(\mathbf{X}) \le 0$, $j = 1, 2, \ldots, m$, are satisfied,
while Eqs. (2.64) imply that either $\lambda_j = 0$ or $y_j = 0$. If $\lambda_j = 0$, it means that the
$j$th constraint is inactive† and hence can be ignored. On the other hand, if $y_j = 0$, it
means that the constraint is active ($g_j = 0$) at the optimum point. Consider the division
of the constraints into two subsets, $J_1$ and $J_2$, where $J_1 + J_2$ represent the total set of
constraints. Let the set $J_1$ indicate the indices of those constraints that are active at the
optimum point and $J_2$ include the indices of all the inactive constraints.

Thus for $j \in J_1$,‡ $y_j = 0$ (constraints are active), for $j \in J_2$, $\lambda_j = 0$ (constraints
are inactive), and Eqs. (2.62) can be simplified as


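$$\frac{\partial f}{\partial x_i} + \sum_{j \in J_1} \lambda_j\frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \tag{2.65}$$

Similarly, Eqs. (2.63) can be written as

$$g_j(\mathbf{X}) = 0, \quad j \in J_1 \tag{2.66}$$

$$g_j(\mathbf{X}) + y_j^2 = 0, \quad j \in J_2 \tag{2.67}$$

Equations (2.65) to (2.67) represent $n + p + (m - p) = n + m$ equations in the $n + m$
unknowns $x_i$ ($i = 1, 2, \ldots, n$), $\lambda_j$ ($j \in J_1$), and $y_j$ ($j \in J_2$), where $p$ denotes the number
of active constraints.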

Assuming that the first $p$ constraints are active, Eqs. (2.65) can be expressed as

$$-\frac{\partial f}{\partial x_i} = \lambda_1\frac{\partial g_1}{\partial x_i} + \lambda_2\frac{\partial g_2}{\partial x_i} + \cdots + \lambda_p\frac{\partial g_p}{\partial x_i}, \quad i = 1, 2, \ldots, n \tag{2.68}$$

These equations can be written collectively as

$$-\nabla f = \lambda_1\nabla g_1 + \lambda_2\nabla g_2 + \cdots + \lambda_p\nabla g_p \tag{2.69}$$

where $\nabla f$ and $\nabla g_j$ are the gradients of the objective function and the $j$th constraint,
respectively:

$$\nabla f = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \\ \vdots \\ \partial f/\partial x_n \end{Bmatrix} \quad \text{and} \quad \nabla g_j = \begin{Bmatrix} \partial g_j/\partial x_1 \\ \partial g_j/\partial x_2 \\ \vdots \\ \partial g_j/\partial x_n \end{Bmatrix}$$

Equation (2.69) indicates that the negative of the gradient of the objective function can
be expressed as a linear combination of the gradients of the active constraints at the
optimum point.

†Those constraints that are satisfied with an equality sign, $g_j = 0$, at the optimum point are called the
active constraints, while those that are satisfied with a strict inequality sign, $g_j < 0$, are termed inactive
constraints.

‡The symbol $\in$ is used to denote the meaning "belongs to" or "element of".

Further, we can show that in the case of a minimization problem, the $\lambda_j$ values
($j \in J_1$) have to be positive. For simplicity of illustration, suppose that only two constraints
are active ($p = 2$) at the optimum point. Then Eq. (2.69) reduces to

$$-\nabla f = \lambda_1\nabla g_1 + \lambda_2\nabla g_2 \tag{2.70}$$

Let $\mathbf{S}$ be a feasible direction† at the optimum point. By premultiplying both sides of
Eq. (2.70) by $\mathbf{S}^T$, we obtain

$$-\mathbf{S}^T\nabla f = \lambda_1\mathbf{S}^T\nabla g_1 + \lambda_2\mathbf{S}^T\nabla g_2 \tag{2.71}$$

where the superscript $T$ denotes the transpose. Since $\mathbf{S}$ is a feasible direction, it should
satisfy the relations

$$\mathbf{S}^T\nabla g_1 < 0$$
$$\mathbf{S}^T\nabla g_2 < 0 \tag{2.72}$$

Thus if $\lambda_1 > 0$ and $\lambda_2 > 0$, the quantity $\mathbf{S}^T\nabla f$ can be seen always to be positive. As
$\nabla f$ indicates the gradient direction, along which the value of the function increases at
the maximum rate,‡ $\mathbf{S}^T\nabla f$ represents the component of the increment of $f$ along the
direction $\mathbf{S}$. If $\mathbf{S}^T\nabla f > 0$, the function value increases as we move along the direction $\mathbf{S}$.
Hence if $\lambda_1$ and $\lambda_2$ are positive, we will not be able to find any direction in the feasible
domain along which the function value can be decreased further. Since the point at
which Eq. (2.72) is valid is assumed to be optimum, $\lambda_1$ and $\lambda_2$ have to be positive.
This reasoning can be extended to cases where there are more than two constraints
active. By proceeding in a similar manner, one can show that the $\lambda_j$ values have to be
negative for a maximization problem.

1 A vector S is called a feasible direction from a point X if at least a small step can be taken along S 
that does not immediately leave the feasible region. Thus for problems with sufficiently smooth constraint 

surfaces, vector S satisfying the relation 

S T W gj < 0 

can be called a feasible direction. On the other hand, if the constraint is either linear or concave, as shown 
in Fig. 2.8b and c, any vector satisfying the relation 

S T Vgj < 0 

can be called a feasible direction. The geometric interpretation of a feasible direction is that the vector 
S makes an obtuse angle with all the constraint normals, except that for the linear or outward-curving 
(concave) constraints, the angle may go to as low as 90°. 

*See Section 6.10.2 for a proof of this statement. 
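Example 2.12 Consider the following optimization problem:

Minimize $f(x_1, x_2) = x_1^2 + x_2^2$

subject to

$$x_1 + 2x_2 \le 15$$
$$1 \le x_i \le 10; \quad i = 1, 2$$

Derive the conditions to be satisfied at the point $\mathbf{X}_1 = \{1, 7\}^T$ by the search direction
$\mathbf{S} = \{s_1, s_2\}^T$ if it is a (a) usable direction, and (b) feasible direction.

SOLUTION The objective function and the constraints can be stated as

$$f(x_1, x_2) = x_1^2 + x_2^2$$
$$g_1(\mathbf{X}) = x_1 + 2x_2 \le 15$$
$$g_2(\mathbf{X}) = 1 - x_1 \le 0$$
$$g_3(\mathbf{X}) = 1 - x_2 \le 0$$
$$g_4(\mathbf{X}) = x_1 - 10 \le 0$$
$$g_5(\mathbf{X}) = x_2 - 10 \le 0$$

At the given point $\mathbf{X}_1 = \{1, 7\}^T$, all the constraints can be seen to be satisfied with $g_1$
and $g_2$ being active. The gradients of the objective and active constraint functions at
point $\mathbf{X}_1 = \{1, 7\}^T$ are given by

$$\nabla f = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 2x_1 \\ 2x_2 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 2 \\ 14 \end{Bmatrix}$$

$$\nabla g_1 = \begin{Bmatrix} \partial g_1/\partial x_1 \\ \partial g_1/\partial x_2 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 1 \\ 2 \end{Bmatrix}$$

$$\nabla g_2 = \begin{Bmatrix} \partial g_2/\partial x_1 \\ \partial g_2/\partial x_2 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} -1 \\ 0 \end{Bmatrix}$$

For the search direction $\mathbf{S} = \{s_1, s_2\}^T$, the usability and feasibility conditions can be
expressed as

(a) Usability condition:

$$\mathbf{S}^T \nabla f < 0 \quad \text{or} \quad (s_1 \;\; s_2) \begin{Bmatrix} 2 \\ 14 \end{Bmatrix} < 0 \quad \text{or} \quad 2s_1 + 14s_2 < 0 \tag{E1}$$

(b) Feasibility conditions (both active constraints are linear, so equality is permitted):

$$\mathbf{S}^T \nabla g_1 \le 0 \quad \text{or} \quad (s_1 \;\; s_2) \begin{Bmatrix} 1 \\ 2 \end{Bmatrix} \le 0 \quad \text{or} \quad s_1 + 2s_2 \le 0 \tag{E2}$$

$$\mathbf{S}^T \nabla g_2 \le 0 \quad \text{or} \quad (s_1 \;\; s_2) \begin{Bmatrix} -1 \\ 0 \end{Bmatrix} \le 0 \quad \text{or} \quad -s_1 \le 0 \tag{E3}$$

Note: Any two numbers for $s_1$ and $s_2$ that satisfy the inequality (E1) will constitute
a usable direction $\mathbf{S}$. For example, $s_1 = 1$ and $s_2 = -1$ gives the usable direction
$\mathbf{S} = \{1, -1\}^T$. This direction can also be seen to be a feasible direction because it
satisfies the inequalities (E2) and (E3).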
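
The usability and feasibility tests of Example 2.12 are simple inner products, so they are easy to check numerically for any trial direction. The following is a minimal sketch, assuming NumPy is available; the helper names `is_usable` and `is_feasible` are illustrative and do not come from the text, and the non-strict feasibility test assumes the active constraints are linear, as they are here.

```python
import numpy as np

# Gradients evaluated at X1 = {1, 7}^T in Example 2.12
grad_f = np.array([2.0, 14.0])            # {2*x1, 2*x2}
grad_g_active = np.array([[1.0, 2.0],     # gradient of g1
                          [-1.0, 0.0]])   # gradient of g2


def is_usable(S):
    """Usability test: S^T grad f < 0 (the function decreases along S)."""
    return bool(np.dot(S, grad_f) < 0.0)


def is_feasible(S):
    """Feasibility test: S^T grad g_j <= 0 for every active (linear) constraint."""
    return bool(np.all(grad_g_active @ np.asarray(S, dtype=float) <= 0.0))


for S in ([1.0, -1.0], [-1.0, 0.0], [1.0, 1.0]):
    print(S, "usable:", is_usable(S), "feasible:", is_feasible(S))
```

The direction {1, -1} passes both tests, {-1, 0} is usable but leaves the feasible region through the bound on x1, and {1, 1} increases the objective.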
2.5.1 Kuhn-Tucker Conditions

As shown above, the conditions to be satisfied at a constrained minimum point, $\mathbf{X}^*$, of
the problem stated in Eq. (2.58) can be expressed as

$$\frac{\partial f}{\partial x_i} + \sum_{j \in J_1} \lambda_j \frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \tag{2.73}$$

$$\lambda_j > 0, \quad j \in J_1 \tag{2.74}$$

These are called Kuhn-Tucker conditions after the mathematicians who derived them
as the necessary conditions to be satisfied at a relative minimum of $f(\mathbf{X})$ [2.8]. These
conditions are, in general, not sufficient to ensure a relative minimum. However, there is
a class of problems, called convex programming problems,† for which the Kuhn-Tucker
conditions are necessary and sufficient for a global minimum.

If the set of active constraints is not known, the Kuhn-Tucker conditions can be
stated as follows:

$$\begin{aligned}
\frac{\partial f}{\partial x_i} + \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_i} &= 0, & i &= 1, 2, \ldots, n \\
\lambda_j g_j &= 0,‡ & j &= 1, 2, \ldots, m \\
g_j &\le 0, & j &= 1, 2, \ldots, m \\
\lambda_j &\ge 0, & j &= 1, 2, \ldots, m
\end{aligned} \tag{2.75}$$

Note that if the problem is one of maximization or if the constraints are of the type
$g_j \ge 0$, the $\lambda_j$ have to be nonpositive in Eqs. (2.75). On the other hand, if the problem is
one of maximization with constraints in the form $g_j \ge 0$, the $\lambda_j$ have to be nonnegative
in Eqs. (2.75).
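
The four groups of conditions in Eqs. (2.75) can be checked mechanically once the gradients, constraint values, and candidate multipliers at a point are known. The sketch below is an illustration only, assuming NumPy; the function names `kt_residuals` and `satisfies_kt` and the tolerance are hypothetical choices, not part of the text. It is applied to the problem of Example 2.12 (minimize $x_1^2 + x_2^2$ with all constraints written as $g_j \le 0$) at the corner point {1, 1} and at the point {1, 7}.

```python
import numpy as np


def kt_residuals(grad_f, grads_g, g_vals, lam):
    """Worst violation of each group of conditions in Eqs. (2.75)."""
    grad_f = np.asarray(grad_f, dtype=float)
    G = np.asarray(grads_g, dtype=float)   # row j holds the gradient of g_j
    g = np.asarray(g_vals, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return {
        "stationarity": float(np.abs(grad_f + G.T @ lam).max()),
        "complementarity": float(np.abs(lam * g).max()),
        "g_j <= 0": float(max(g.max(), 0.0)),
        "lambda_j >= 0": float(max((-lam).max(), 0.0)),
    }


def satisfies_kt(grad_f, grads_g, g_vals, lam, tol=1e-9):
    """True when every residual is within the (assumed) tolerance."""
    return all(v <= tol for v in kt_residuals(grad_f, grads_g, g_vals, lam).values())


# Constraint gradients of Example 2.12 (all five constraints are linear)
grads_g = [[1, 2], [-1, 0], [0, -1], [1, 0], [0, 1]]


def g_values(x1, x2):
    return [x1 + 2 * x2 - 15, 1 - x1, 1 - x2, x1 - 10, x2 - 10]


# At {1, 1}: g2 and g3 are active, with multipliers 2 and 2
print(satisfies_kt([2, 2], grads_g, g_values(1, 1), [0, 2, 2, 0, 0]))
# At {1, 7}: stationarity forces negative multipliers on g1 and g2
print(satisfies_kt([2, 14], grads_g, g_values(1, 7), [-7, -5, 0, 0, 0]))
```

At {1, 7} the stationarity equations can only be satisfied with negative multipliers, which is consistent with the existence of a usable feasible direction at that point.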


2.5.2 Constraint Qualification 

When the optimization problem is stated as

Minimize $f(\mathbf{X})$

subject to

$$\begin{aligned}
g_j(\mathbf{X}) &\le 0, & j &= 1, 2, \ldots, m \\
h_k(\mathbf{X}) &= 0, & k &= 1, 2, \ldots, p
\end{aligned} \tag{2.76}$$

the Kuhn-Tucker conditions become

$$\begin{aligned}
\nabla f + \sum_{j=1}^{m} \lambda_j \nabla g_j - \sum_{k=1}^{p} \beta_k \nabla h_k &= \mathbf{0} \\
\lambda_j g_j &= 0, & j &= 1, 2, \ldots, m \\
g_j &\le 0, & j &= 1, 2, \ldots, m \\
h_k &= 0, & k &= 1, 2, \ldots, p \\
\lambda_j &\ge 0, & j &= 1, 2, \ldots, m
\end{aligned} \tag{2.77}$$

†See Sections 2.6 and 7.14 for a detailed discussion of convex programming problems.

‡This condition is the same as Eq. (2.64).

where $\lambda_j$ and $\beta_k$ denote the Lagrange multipliers associated with the constraints
$g_j \le 0$ and $h_k = 0$, respectively. Although we found qualitatively that the
Kuhn-Tucker conditions represent the necessary conditions of optimality, the
following theorem gives the precise conditions of optimality.

Theorem 2.7 Let $\mathbf{X}^*$ be a feasible solution to the problem of Eqs. (2.76). If $\nabla g_j(\mathbf{X}^*)$,
$j \in J_1$ and $\nabla h_k(\mathbf{X}^*)$, $k = 1, 2, \ldots, p$, are linearly independent, there exist $\boldsymbol{\lambda}^*$ and $\boldsymbol{\beta}^*$
such that $(\mathbf{X}^*, \boldsymbol{\lambda}^*, \boldsymbol{\beta}^*)$ satisfy Eqs. (2.77).

Proof: See Ref. [2.11].

The requirement that $\nabla g_j(\mathbf{X}^*)$, $j \in J_1$ and $\nabla h_k(\mathbf{X}^*)$, $k = 1, 2, \ldots, p$, be linearly
independent is called the constraint qualification. If the constraint qualification is
violated at the optimum point, Eqs. (2.77) may or may not have a solution. It is difficult
to verify the constraint qualification without knowing $\mathbf{X}^*$ beforehand. However, the
constraint qualification is always satisfied for problems having any of the following
characteristics:

1. All the inequality and equality constraint functions are linear.

2. All the inequality constraint functions are convex, all the equality constraint
functions are linear, and at least one feasible vector $\tilde{\mathbf{X}}$ exists that lies strictly
inside the feasible region, so that

$$g_j(\tilde{\mathbf{X}}) < 0, \quad j = 1, 2, \ldots, m \quad \text{and} \quad h_k(\tilde{\mathbf{X}}) = 0, \quad k = 1, 2, \ldots, p$$


Example 2.13 Consider the following problem:

$$\text{Minimize } f(x_1, x_2) = (x_1 - 1)^2 + x_2^2 \tag{E1}$$

subject to

$$g_1(x_1, x_2) = x_1^3 - 2x_2 \le 0 \tag{E2}$$

$$g_2(x_1, x_2) = x_1^3 + 2x_2 \le 0 \tag{E3}$$

Determine whether the constraint qualification and the Kuhn-Tucker conditions are
satisfied at the optimum point.


SOLUTION The feasible region and the contours of the objective function are shown
in Fig. 2.9. It can be seen that the optimum solution is (0, 0). Since $g_1$ and $g_2$ are both
active at the optimum point (0, 0), their gradients can be computed as

$$\nabla g_1(\mathbf{X}^*) = \begin{Bmatrix} 3x_1^2 \\ -2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 0 \\ -2 \end{Bmatrix} \quad \text{and} \quad \nabla g_2(\mathbf{X}^*) = \begin{Bmatrix} 3x_1^2 \\ 2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 0 \\ 2 \end{Bmatrix}$$

It is clear that $\nabla g_1(\mathbf{X}^*)$ and $\nabla g_2(\mathbf{X}^*)$ are not linearly independent. Hence the constraint
qualification is not satisfied at the optimum point. Noting that

$$\nabla f(\mathbf{X}^*) = \begin{Bmatrix} 2(x_1 - 1) \\ 2x_2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} -2 \\ 0 \end{Bmatrix}$$

the Kuhn-Tucker conditions can be written, using Eqs. (2.73) and (2.74), as

$$-2 + \lambda_1(0) + \lambda_2(0) = 0 \tag{E4}$$

$$0 + \lambda_1(-2) + \lambda_2(2) = 0 \tag{E5}$$

$$\lambda_1 > 0 \tag{E6}$$

$$\lambda_2 > 0 \tag{E7}$$

Since Eq. (E4) is not satisfied and Eq. (E5) can be satisfied for negative values of
$\lambda_1 = \lambda_2$ also, the Kuhn-Tucker conditions are not satisfied at the optimum point.
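
The two findings of Example 2.13 can be reproduced numerically: the rank of the matrix of active constraint gradients tests the constraint qualification, and a least-squares solve shows that no choice of multipliers can satisfy the stationarity equations (E4) and (E5) together. This is a minimal sketch assuming NumPy; the variable names are illustrative.

```python
import numpy as np

x = np.array([0.0, 0.0])                         # optimum point of Example 2.13

grad_f = np.array([2.0 * (x[0] - 1.0), 2.0 * x[1]])
grad_g = np.array([[3.0 * x[0] ** 2, -2.0],      # gradient of g1 = x1^3 - 2*x2
                   [3.0 * x[0] ** 2, 2.0]])      # gradient of g2 = x1^3 + 2*x2

# Constraint qualification: the active gradients must be linearly independent,
# i.e. the rank must equal the number of active constraints (2 here).
rank = np.linalg.matrix_rank(grad_g)

# Best multipliers in the least-squares sense for grad_f + sum(lam_j * grad_g_j) = 0
lam, *_ = np.linalg.lstsq(grad_g.T, -grad_f, rcond=None)
residual = np.linalg.norm(grad_f + grad_g.T @ lam)

print("rank of active gradients:", rank)
print("least-squares multipliers:", lam)
print("stationarity residual:", residual)
```

A nonzero residual means the stationarity equations have no solution at all, which is the numerical counterpart of Eq. (E4) reading -2 = 0.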
Example 2.14 A manufacturing firm producing small refrigerators has entered into
a contract to supply 50 refrigerators at the end of the first month, 50 at the end of the
second month, and 50 at the end of the third. The cost of producing $x$ refrigerators
in any month is given by \$$(x^2 + 1000)$. The firm can produce more refrigerators in
any month and carry them to a subsequent month. However, it costs \$20 per unit for
any refrigerator carried over from one month to the next. Assuming that there is no
initial inventory, determine the number of refrigerators to be produced in each month
to minimize the total cost.

SOLUTION Let $x_1$, $x_2$, and $x_3$ represent the number of refrigerators produced in the
first, second, and third month, respectively. The total cost to be minimized is given by

total cost = production cost + holding cost

or

$$\begin{aligned}
f(x_1, x_2, x_3) &= (x_1^2 + 1000) + (x_2^2 + 1000) + (x_3^2 + 1000) + 20(x_1 - 50) + 20(x_1 + x_2 - 100) \\
&= x_1^2 + x_2^2 + x_3^2 + 40x_1 + 20x_2
\end{aligned}$$

The constraints can be stated as

$$g_1(x_1, x_2, x_3) = x_1 - 50 \ge 0$$
$$g_2(x_1, x_2, x_3) = x_1 + x_2 - 100 \ge 0$$
$$g_3(x_1, x_2, x_3) = x_1 + x_2 + x_3 - 150 \ge 0$$

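The Kuhn-Tucker conditions are given by

$$\frac{\partial f}{\partial x_i} + \lambda_1 \frac{\partial g_1}{\partial x_i} + \lambda_2 \frac{\partial g_2}{\partial x_i} + \lambda_3 \frac{\partial g_3}{\partial x_i} = 0, \quad i = 1, 2, 3$$

that is,

$$2x_1 + 40 + \lambda_1 + \lambda_2 + \lambda_3 = 0 \tag{E1}$$

$$2x_2 + 20 + \lambda_2 + \lambda_3 = 0 \tag{E2}$$

$$2x_3 + \lambda_3 = 0 \tag{E3}$$

$$\lambda_j g_j = 0, \quad j = 1, 2, 3$$

that is,

$$\lambda_1(x_1 - 50) = 0 \tag{E4}$$

$$\lambda_2(x_1 + x_2 - 100) = 0 \tag{E5}$$

$$\lambda_3(x_1 + x_2 + x_3 - 150) = 0 \tag{E6}$$

$$g_j \ge 0, \quad j = 1, 2, 3$$

that is,

$$x_1 - 50 \ge 0 \tag{E7}$$

$$x_1 + x_2 - 100 \ge 0 \tag{E8}$$

$$x_1 + x_2 + x_3 - 150 \ge 0 \tag{E9}$$

$$\lambda_j \le 0, \quad j = 1, 2, 3$$

that is,

$$\lambda_1 \le 0 \tag{E10}$$

$$\lambda_2 \le 0 \tag{E11}$$

$$\lambda_3 \le 0 \tag{E12}$$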

The solution of Eqs. (E1) to (E12) can be found in several ways. We proceed to solve
these equations by first noting that either $\lambda_1 = 0$ or $x_1 = 50$ according to Eq. (E4).
Using this information, we investigate the following cases to identify the optimum
solution of the problem.


Case 1: $\lambda_1 = 0$.

Equations (E1) to (E3) give

$$\begin{aligned}
x_3 &= -\frac{\lambda_3}{2} \\
x_2 &= -10 - \frac{\lambda_2}{2} - \frac{\lambda_3}{2} \\
x_1 &= -20 - \frac{\lambda_2}{2} - \frac{\lambda_3}{2}
\end{aligned} \tag{E13}$$

Substituting Eqs. (E13) in Eqs. (E5) and (E6), we obtain

$$\begin{aligned}
\lambda_2(-130 - \lambda_2 - \lambda_3) &= 0 \\
\lambda_3\left(-180 - \lambda_2 - \tfrac{3}{2}\lambda_3\right) &= 0
\end{aligned} \tag{E14}$$

The four possible solutions of Eqs. (E14) are

1. $\lambda_2 = 0$, $-180 - \lambda_2 - \tfrac{3}{2}\lambda_3 = 0$. These equations, along with Eqs. (E13), yield
the solution

$$\lambda_2 = 0, \quad \lambda_3 = -120, \quad x_1 = 40, \quad x_2 = 50, \quad x_3 = 60$$

This solution satisfies Eqs. (E10) to (E12) but violates Eqs. (E7) and (E8) and
hence cannot be optimum.

2. $\lambda_3 = 0$, $-130 - \lambda_2 - \lambda_3 = 0$. The solution of these equations leads to

$$\lambda_2 = -130, \quad \lambda_3 = 0, \quad x_1 = 45, \quad x_2 = 55, \quad x_3 = 0$$

This solution can be seen to satisfy Eqs. (E10) to (E12) but violate Eqs. (E7)
and (E9).

3. $\lambda_2 = 0$, $\lambda_3 = 0$. Equations (E13) give

$$x_1 = -20, \quad x_2 = -10, \quad x_3 = 0$$

This solution satisfies Eqs. (E10) to (E12) but violates the constraints, Eqs. (E7)
to (E9).

4. $-130 - \lambda_2 - \lambda_3 = 0$, $-180 - \lambda_2 - \tfrac{3}{2}\lambda_3 = 0$. The solution of these equations
and Eqs. (E13) yields

$$\lambda_2 = -30, \quad \lambda_3 = -100, \quad x_1 = 45, \quad x_2 = 55, \quad x_3 = 50$$

This solution satisfies Eqs. (E10) to (E12) but violates the constraint, Eq. (E7).


Case 2: $x_1 = 50$.

In this case, Eqs. (E1) to (E3) give

$$\begin{aligned}
\lambda_3 &= -2x_3 \\
\lambda_2 &= -20 - 2x_2 - \lambda_3 = -20 - 2x_2 + 2x_3 \\
\lambda_1 &= -40 - 2x_1 - \lambda_2 - \lambda_3 = -120 + 2x_2
\end{aligned} \tag{E15}$$

Substitution of Eqs. (E15) in Eqs. (E5) and (E6) leads to

$$\begin{aligned}
(-20 - 2x_2 + 2x_3)(x_1 + x_2 - 100) &= 0 \\
(-2x_3)(x_1 + x_2 + x_3 - 150) &= 0
\end{aligned} \tag{E16}$$

Once again, it can be seen that there are four possible solutions to Eqs. (E16), as
indicated below:

1. $-20 - 2x_2 + 2x_3 = 0$, $x_1 + x_2 + x_3 - 150 = 0$: The solution of these
equations yields

$$x_1 = 50, \quad x_2 = 45, \quad x_3 = 55$$

This solution can be seen to violate Eq. (E8).

2. $-20 - 2x_2 + 2x_3 = 0$, $-2x_3 = 0$: These equations lead to the solution

$$x_1 = 50, \quad x_2 = -10, \quad x_3 = 0$$

This solution can be seen to violate Eqs. (E8) and (E9).

3. $x_1 + x_2 - 100 = 0$, $-2x_3 = 0$: These equations give

$$x_1 = 50, \quad x_2 = 50, \quad x_3 = 0$$

This solution violates the constraint Eq. (E9).

4. $x_1 + x_2 - 100 = 0$, $x_1 + x_2 + x_3 - 150 = 0$: The solution of these equations
yields

$$x_1 = 50, \quad x_2 = 50, \quad x_3 = 50$$

This solution can be seen to satisfy all the constraint Eqs. (E7) to (E9). The
values of $\lambda_1$, $\lambda_2$, and $\lambda_3$ corresponding to this solution can be obtained from
Eqs. (E15) as

$$\lambda_1 = -20, \quad \lambda_2 = -20, \quad \lambda_3 = -100$$

Since these values of $\lambda_j$ satisfy the requirements [Eqs. (E10) to (E12)], this
solution can be identified as the optimum solution. Thus

$$x_1^* = 50, \quad x_2^* = 50, \quad x_3^* = 50$$
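
The case analysis of Example 2.14 amounts to trying every possible set of active constraints. Because the objective is quadratic and the constraints are linear, each trial reduces to one linear system, so the search can be automated. The sketch below assumes NumPy; the function names `solve_for_active_set` and `kuhn_tucker_points` and the tolerance are illustrative choices, not part of the text. It uses the sign convention of the example ($g_j \ge 0$ with $\lambda_j \le 0$).

```python
from itertools import combinations

import numpy as np

# Example 2.14: f = x^T x + c^T x, constraints G x - b >= 0
c = np.array([40.0, 20.0, 0.0])
G = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
b = np.array([50.0, 100.0, 150.0])


def solve_for_active_set(active):
    """Solve (E1)-(E3) with g_j = 0 for j in `active` and lambda_j = 0 otherwise."""
    idx = np.array(active, dtype=int)
    k = idx.size
    K = np.zeros((3 + k, 3 + k))
    K[:3, :3] = 2.0 * np.eye(3)   # second derivatives of f
    K[:3, 3:] = G[idx].T          # multiplier terms in (E1)-(E3)
    K[3:, :3] = G[idx]            # active constraints g_j = 0
    rhs = np.concatenate([-c, b[idx]])
    sol = np.linalg.solve(K, rhs)
    lam = np.zeros(3)
    lam[idx] = sol[3:]
    return sol[:3], lam


def kuhn_tucker_points(tol=1e-9):
    """Return every (x, lambda) that also satisfies (E7)-(E12)."""
    points = []
    for k in range(4):
        for active in combinations(range(3), k):
            x, lam = solve_for_active_set(active)
            feasible = np.all(G @ x - b >= -tol)   # (E7) to (E9)
            signs_ok = np.all(lam <= tol)          # (E10) to (E12)
            if feasible and signs_ok:
                points.append((x, lam))
    return points


points = kuhn_tucker_points()
for x, lam in points:
    print("x =", x, "lambda =", lam)
```

Only the trial with all three constraints active survives the feasibility and sign checks, in agreement with solution 4 of Case 2. The eight trials correspond one-to-one with the eight sub-cases examined by hand.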

2.6 CONVEX PROGRAMMING PROBLEM 

The optimization problem stated in Eq. (2.58) is called a convex programming problem
if the objective function $f(\mathbf{X})$ and the constraint functions $g_j(\mathbf{X})$ are convex. The
definition and properties of a convex function are given in Appendix A. Suppose that
$f(\mathbf{X})$ and $g_j(\mathbf{X})$, $j = 1, 2, \ldots, m$, are convex functions. The Lagrange function of
Eq. (2.61) can be written as

$$L(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda}) = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j \left[ g_j(\mathbf{X}) + y_j^2 \right] \tag{2.78}$$

If $\lambda_j \ge 0$, then $\lambda_j g_j(\mathbf{X})$ is convex, and since $\lambda_j y_j = 0$ from Eq. (2.64), $L(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda})$
will be a convex function. As shown earlier, a necessary condition for $f(\mathbf{X})$ to be a
relative minimum at $\mathbf{X}^*$ is that $L(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda})$ have a stationary point at $\mathbf{X}^*$. However, if
$L(\mathbf{X}, \mathbf{Y}, \boldsymbol{\lambda})$ is a convex function, its derivative vanishes only at one point, which must
be an absolute minimum of the function $f(\mathbf{X})$. Thus the Kuhn-Tucker conditions are
both necessary and sufficient for an absolute minimum of $f(\mathbf{X})$ at $\mathbf{X}^*$.

Notes:

1. If the given optimization problem is known to be a convex programming problem,
there will be no relative minima or saddle points, and hence the extreme
point found by applying the Kuhn-Tucker conditions is guaranteed to be an
absolute minimum of $f(\mathbf{X})$. However, it is often very difficult to ascertain
whether the objective and constraint functions involved in a practical engineering
problem are convex.

2. The derivation of the Kuhn-Tucker conditions was based on the development
given for equality constraints in Section 2.4. One of the requirements for these
conditions was that at least one of the Jacobians composed of the $m$ constraints
and $m$ of the $n + m$ variables $(x_1, x_2, \ldots, x_n;\ y_1, y_2, \ldots, y_m)$ be nonzero. This
requirement is implied in the derivation of the Kuhn-Tucker conditions.
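
For twice-differentiable functions, one practical way to probe convexity is to check that the Hessian matrix is positive semidefinite at the points of interest. The sketch below is an illustration under that assumption, using NumPy; the helper name `is_positive_semidefinite` is hypothetical. It looks at the objective of Example 2.14, whose Hessian is constant, and at the constraint $g_1 = x_1^3 - 2x_2$ of Example 2.13, whose Hessian changes sign with $x_1$.

```python
import numpy as np


def is_positive_semidefinite(H, tol=1e-12):
    """True when every eigenvalue of the symmetric matrix H is >= -tol."""
    return bool(np.all(np.linalg.eigvalsh(np.asarray(H, dtype=float)) >= -tol))


# Example 2.14: f = x1^2 + x2^2 + x3^2 + 40*x1 + 20*x2, Hessian = 2*I everywhere
hess_f_214 = 2.0 * np.eye(3)


def hess_g1_213(x1):
    """Hessian of g1 = x1^3 - 2*x2 from Example 2.13; depends only on x1."""
    return np.array([[6.0 * x1, 0.0],
                     [0.0, 0.0]])


print(is_positive_semidefinite(hess_f_214))         # convex objective
print(is_positive_semidefinite(hess_g1_213(1.0)))   # locally convex for x1 > 0
print(is_positive_semidefinite(hess_g1_213(-1.0)))  # fails for x1 < 0
```

With linear constraints and a convex objective, Example 2.14 is a convex programming problem, so its Kuhn-Tucker point is the absolute minimum. The constraint of Example 2.13 is not convex over the whole plane, so that problem falls outside this class.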
REVIEW QUESTIONS

2.1 State the necessary and sufficient conditions for the minimum of a function $f(x)$.

2.2 Under what circumstances can the condition $df(x)/dx = 0$ not be used to find the
minimum of the function $f(x)$?

2.3 Define the $r$th differential, $d^r f(\mathbf{X})$, of a multivariable function $f(\mathbf{X})$.

2.4 Write the Taylor’s series expansion of a function $f(\mathbf{X})$.

2.5 State the necessary and sufficient conditions for the maximum of a multivariable function
$f(\mathbf{X})$.

2.6 What is a quadratic form?

2.7 How do you test the positive, negative, or indefiniteness of a square matrix [A]?

2.8 Define a saddle point and indicate its significance.

2.9 State the various methods available for solving a multivariable optimization problem with
equality constraints.

2.10 State the principle behind the method of constrained variation.

2.11 What is the Lagrange multiplier method?

2.12 What is the significance of Lagrange multipliers?

2.13 Convert an inequality constrained problem into an equivalent unconstrained problem.

2.14 State the Kuhn-Tucker conditions.

2.15 What is an active constraint?

2.16 Define a usable feasible direction.

2.17 What is a convex programming problem? What is its significance?

2.18 Answer whether each of the following quadratic forms is positive definite, negative
definite, or neither:

(a) $f = x_1^2 - x_2^2$

(b) $f = 4x_1x_2$

(c) $f = x_1^2 + 2x_2^2$

(d) $f = -x_1^2 + 4x_1x_2 + 4x_2^2$

(e) $f = -x_1^2 + 4x_1x_2 - 9x_2^2 + 2x_1x_3 + 8x_2x_3 - 4x_3^2$

2.19 State whether each of the following functions is convex, concave, or neither:

(a) $f = -2x^2 + 8x + 4$

(b) $f = x^2 + 10x + 1$

(c) $f = x_1^2 - x_2^2$

(d) $f = -x_1^2 + 4x_1x_2$

(e) $f = e^{-x}$, $x > 0$

(f) $f = \sqrt{x}$, $x > 0$

(g) $f = x_1x_2$

(h) $f = (x_1 - 1)^2 + 10(x_2 - 2)^2$

2.20 Match the following equations and their characteristics:

(a) $f = 4x_1 - 3x_2 + 2$                    Relative maximum at (1, 2)

(b) $f = (2x_1 - 2)^2 + (x_2 - 2)^2$          Saddle point at origin

(c) $f = -(x_1 - 1)^2 - (x_2 - 2)^2$          No minimum

(d) $f = x_1x_2$                              Inflection point at origin

(e) $f = x^3$                                 Relative minimum at (1, 2)

PROBLEMS 

2.1 A dc generator has an internal resistance R ohms and develops an open-circuit voltage of 
V volts (Fig. 2.10). Find the value of the load resistance r for which the power delivered 
by the generator will be a maximum. 

2.2 Find the maxima and minima, if any, of the function 


(x - l)(x - 3) 3 
Figure 2.10 Electric generator with load.


2.3 Find the maxima and minima, if any, of the function

$$f(x) = 4x^3 - 18x^2 + 27x - 7$$

2.4 The efficiency of a screw jack is given by

$$\eta = \frac{\tan\alpha}{\tan(\alpha + \phi)}$$

where $\alpha$ is the lead angle and $\phi$ is a constant. Prove that the efficiency of the screw jack
will be maximum when $\alpha = 45^\circ - \phi/2$ with $\eta_{\max} = (1 - \sin\phi)/(1 + \sin\phi)$.

2.5 Find the minimum of the function

$$f(x) = 10x^6 - 48x^5 + 15x^4 + 200x^3 - 120x^2 - 480x + 100$$

2.6 Find the angular orientation of a cannon to maximize the range of the projectile.

2.7 In a submarine telegraph cable the speed of signaling varies as $x^2 \log(1/x)$, where $x$ is
the ratio of the radius of the core to that of the covering. Show that the greatest speed is
attained when this ratio is $1 : \sqrt{e}$.

2.8 The horsepower generated by a Pelton wheel is proportional to $u(V - u)$, where $u$ is the
velocity of the wheel, which is variable, and $V$ is the velocity of the jet, which is fixed.
Show that the efficiency of the Pelton wheel will be maximum when $u = V/2$.

2.9 A pipe of length $l$ and diameter $D$ has at one end a nozzle of diameter $d$ through which
water is discharged from a reservoir. The level of water in the reservoir is maintained at
a constant value $h$ above the center of nozzle. Find the diameter of the nozzle so that the
kinetic energy of the jet is a maximum. The kinetic energy of the jet can be expressed
as

$$\frac{\pi}{4}\, \rho\, d^2 \left( \frac{2gD^5 h}{D^5 + 4fld^4} \right)^{3/2}$$

where $\rho$ is the density of water, $f$ the friction coefficient and $g$ the gravitational constant.

2.10 An electric light is placed directly over the center of a circular plot of lawn 100 m in 
diameter. Assuming that the intensity of light varies directly as the sine of the angle at 
which it strikes an illuminated surface, and inversely as the square of its distance from 
the surface, how high should the light be hung in order that the intensity may be as great 
as possible at the circumference of the plot? 
2.11 If a crank is at an angle $\theta$ from dead center with $\theta = \omega t$, where $\omega$ is the angular velocity
and $t$ is time, the distance of the piston from the end of its stroke ($x$) is given by

$$x = r(1 - \cos\theta) + \frac{r^2}{4l}(1 - \cos 2\theta)$$

where $r$ is the length of the crank and $l$ is the length of the connecting rod. For $r = 1$
and $l = 5$, find (a) the angular position of the crank at which the piston moves with
maximum velocity, and (b) the distance of the piston from the end of its stroke at that
instant.


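Determine whether each of the matrices in Problems 2.12-2.14 is positive definite, negative
definite, or indefinite by finding its eigenvalues.

2.12 $[A] = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 3 & -1 \\ -1 & -1 & 5 \end{bmatrix}$

2.13 $[B] = \begin{bmatrix} 4 & 2 & -4 \\ 2 & 4 & -2 \\ -4 & -2 & 4 \end{bmatrix}$

2.14 $[C] = \begin{bmatrix} -1 & -1 & -1 \\ -1 & -2 & -2 \\ -1 & -2 & -3 \end{bmatrix}$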

Determine whether each of the matrices in Problems 2.15-2.17 is positive definite, negative
definite, or indefinite by evaluating the signs of its submatrices.

2.15 $[A] = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 3 & -1 \\ -1 & -1 & 5 \end{bmatrix}$

2.16 $[B] = \begin{bmatrix} 4 & 2 & -4 \\ 2 & 4 & -2 \\ -4 & -2 & 4 \end{bmatrix}$

2.17 $[C] = \begin{bmatrix} -1 & -1 & -1 \\ -1 & -2 & -2 \\ -1 & -2 & -3 \end{bmatrix}$

2.18 Express the function

$$f(x_1, x_2, x_3) = -x_1^2 - x_2^2 + 2x_1x_2 - x_3^2 + 6x_1x_3 + 4x_1 - 5x_3 + 2$$

in matrix form as

$$f(\mathbf{X}) = \tfrac{1}{2}\mathbf{X}^T [A] \mathbf{X} + \mathbf{B}^T \mathbf{X} + C$$

and determine whether the matrix [A] is positive definite, negative definite, or indefinite.

2.19 Determine whether the following matrix is positive or negative definite:

$$[A] = \begin{bmatrix} 4 & -3 & 0 \\ -3 & 0 & 4 \\ 0 & 4 & 2 \end{bmatrix}$$
2.20 Determine whether the following matrix is positive definite:

$$[A] = \begin{bmatrix} -14 & 3 & 0 \\ 3 & -1 & 4 \\ 0 & 4 & 2 \end{bmatrix}$$


2.21 The potential energy of the two-bar truss shown in Fig. 2.11 is given by

$$f(x_1, x_2) = \cdots - Px_1 \cos\theta - Px_2 \sin\theta$$

where $E$ is Young’s modulus, $A$ the cross-sectional area of each member, $l$ the span of
the truss, $s$ the length of each member, $h$ the height of the truss, $P$ the applied load,
$\theta$ the angle at which the load is applied, and $x_1$ and $x_2$ are, respectively, the horizontal
and vertical displacements of the free node. Find the values of $x_1$ and $x_2$ that minimize
the potential energy when $E = 207 \times 10^9$ Pa, $A = 10^{-5}$ m$^2$, $l = 1.5$ m, $h = 4.0$ m,
$P = 10^4$ N, and $\theta = 30^\circ$.

2.22 The profit per acre of a farm is given by

$$20x_1 + 26x_2 + 4x_1x_2 - 4x_1^2 - 3x_2^2$$

where $x_1$ and $x_2$ denote, respectively, the labor cost and the fertilizer cost. Find the values
of $x_1$ and $x_2$ to maximize the profit.

2.23 The temperatures measured at various points inside a heated wall are as follows:

Distance from the heated surface as
a percentage of wall thickness, d        0     25     50     75    100
Temperature, t (°C)                    380    200    100     20      0

It is decided to approximate this table by a linear equation (graph) of the form $t = a + bd$,
where $a$ and $b$ are constants. Find the values of the constants $a$ and $b$ that minimize the
sum of the squares of all differences between the graph values and the tabulated values.
2.24 Find the second-order Taylor’s series approximation of the function

$$f(x_1, x_2) = (x_1 - 1)^2 e^{x_2} + x_1$$

at the points (a) (0, 0) and (b) (1, 1).

2.25 Find the third-order Taylor’s series approximation of the function

$$f(x_1, x_2, x_3) = x_2^2 x_3 + x_1 e^{x_3}$$

at point (1, 0, −2).

2.26 The volume of sales ($f$) of a product is found to be a function of the number of newspaper
advertisements ($x$) and the number of minutes of television time ($y$) as

$$f = 12xy - x^2 - 3y^2$$

Each newspaper advertisement or each minute on television costs \$1000. How should
the firm allocate \$48,000 between the two advertising media for maximizing its sales?

2.27 Find the value of $x^*$ at which the following function attains its maximum:

$$f(x) = \frac{1}{10\sqrt{2\pi}}\, e^{-(1/2)[(x - 100)/10]^2}$$

2.28 It is possible to establish the nature of stationary points of an objective function based
on its quadratic approximation. For this, consider the quadratic approximation of a
two-variable function as

$$f(\mathbf{X}) \approx a + \mathbf{b}^T \mathbf{X} + \tfrac{1}{2}\mathbf{X}^T [c] \mathbf{X}$$

where

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}, \quad \mathbf{b} = \begin{Bmatrix} b_1 \\ b_2 \end{Bmatrix}, \quad \text{and} \quad [c] = \begin{bmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{bmatrix}$$

If the eigenvalues of the Hessian matrix, $[c]$, are denoted as $\beta_1$ and $\beta_2$, identify the nature
of the contours of the objective function and the type of stationary point in each of the
following situations.

(a) $\beta_1 = \beta_2$; both positive

(b) $\beta_1 > \beta_2$; both positive

(c) $|\beta_1| = |\beta_2|$; $\beta_1$ and $\beta_2$ have opposite signs

(d) $\beta_1 > 0$, $\beta_2 = 0$

Plot the contours of each of the following functions and identify the nature of its stationary
point.

2.29 $f = 2 - x^2 - y^2 + 4xy$

2.30 $f = 2 + x^2 - y^2$

2.31 $f = xy$

2.32 $f = x^3 - 3xy^2$
2.33 Find the admissible and constrained variations at the point $\mathbf{X} = \{0, 4\}^T$ for the following
problem:

Minimize $f = x_1^2 + (x_2 - 1)^2$

subject to

$$-2x_1^2 + x_2 = 4$$

2.34 Find the diameter of an open cylindrical can that will have the maximum volume for a 
given surface area, S. 

2.35 A rectangular beam is to be cut from a circular log of radius r . Find the cross-sectional 
dimensions of the beam to (a) maximize the cross-sectional area of the beam, and (b) 
maximize the perimeter of the beam section. 

2.36 Find the dimensions of a straight beam of circular cross section that can be cut from a 
conical log of height h and base radius r to maximize the volume of the beam. 

2.37 The deflection of a rectangular beam is inversely proportional to the width and the cube 
of depth. Find the cross-sectional dimensions of a beam, which corresponds to minimum 
deflection, that can be cut from a cylindrical log of radius r. 

2.38 A rectangular box of height a and width b is placed adjacent to a wall (Fig. 2.12). Find 
the length of the shortest ladder that can be made to lean against the wall. 

2.39 Show that the right circular cylinder of given surface (including the ends) and maximum 
volume is such that its height is equal to the diameter of the base. 

2.40 Find the dimensions of a closed cylindrical soft drink can that can hold soft drink of 
volume V for which the surface area (including the top and bottom) is a minimum. 

2.41 An open rectangular box is to be manufactured from a given amount of sheet metal 
(area S ). Find the dimensions of the box to maximize the volume. 


Figure 2.12 Ladder against a wall.
2.42 Find the dimensions of an open rectangular box of volume $V$ for which the amount of
material required for manufacture (surface area) is a minimum.

2.43 A rectangular sheet of metal with sides $a$ and $b$ has four equal square portions (of side $d$)
removed at the corners, and the sides are then turned up so as to form an open rectangular
box. Find the depth of the box that maximizes the volume.

2.44 Show that the cone of the greatest volume that can be inscribed in a given sphere has 
an altitude equal to two-thirds of the diameter of the sphere. Also prove that the curved 
surface of the cone is a maximum for the same value of the altitude. 

2.45 Prove Theorem 2.6. 

2.46 A log of length $l$ is in the form of a frustum of a cone whose ends have radii $a$ and
$b$ ($a > b$). It is required to cut from it a beam of uniform square section. Prove that the
beam of greatest volume that can be cut has a length of $al/[3(a - b)]$.

2.47 It has been decided to leave a margin of 30 mm at the top and 20 mm each at the left
side, right side, and the bottom on the printed page of a book. If the area of the page is
specified as $5 \times 10^4$ mm$^2$, determine the dimensions of a page that provide the largest
printed area.

2.48 Minimize $f = 9 - 8x_1 - 6x_2 - 4x_3 + 2x_1^2 + 2x_2^2 + x_3^2 + 2x_1x_2 + 2x_1x_3$

subject to

$$x_1 + x_2 + 2x_3 = 3$$

by (a) direct substitution, (b) constrained variation, and (c) Lagrange multiplier method.

2.49 Minimize $f(\mathbf{X}) = \tfrac{1}{2}(x_1^2 + x_2^2 + x_3^2)$

subject to

$$g_1(\mathbf{X}) = x_1 - x_2 = 0$$
$$g_2(\mathbf{X}) = x_1 + x_2 + x_3 - 1 = 0$$

by (a) direct substitution, (b) constrained variation, and (c) Lagrange multiplier method.

2.50 Find the values of $x$, $y$, and $z$ that maximize the function

$$f(x, y, z) = \frac{6xyz}{x + 2y + 2z}$$

when $x$, $y$, and $z$ are restricted by the relation $xyz = 16$.

2.51 A tent on a square base of side $2a$ consists of four vertical sides of height $b$ surmounted
by a regular pyramid of height $h$. If the volume enclosed by the tent is $V$, show that the
area of canvas in the tent can be expressed as

$$\frac{2V}{a} - \frac{8ah}{3} + 4a\sqrt{h^2 + a^2}$$

Also show that the least area of the canvas corresponding to a given volume $V$, if $a$ and
$h$ can both vary, is given by

$$a = \frac{\sqrt{5}\,h}{2} \quad \text{and} \quad h = 2b$$


2.52 A department store plans to construct a one-story building with a rectangular planform.
The building is required to have a floor area of 22,500 ft$^2$ and a height of 18 ft. It is
proposed to use brick walls on three sides and a glass wall on the fourth side. Find the
dimensions of the building to minimize the cost of construction of the walls and the roof
assuming that the glass wall costs twice as much as that of the brick wall and the roof
costs three times as much as that of the brick wall per unit area.

2.53 Find the dimensions of the rectangular building described in Problem 2.52 to minimize
the heat loss, assuming that the relative heat losses per unit surface area for the roof,
brick wall, glass wall, and floor are in the proportion 4:2:5:1.

2.54 A funnel, in the form of a right circular cone, is to be constructed from a sheet metal.
Find the dimensions of the funnel for minimum lateral surface area when the volume of
the funnel is specified as 200 in$^3$.

2.55 Find the effect on $f^*$ when the value of $A_0$ is changed to (a) $25\pi$ and (b) $22\pi$ in
Example 2.10 using the property of the Lagrange multiplier.

2.56 (a) Find the dimensions of a rectangular box of volume $V = 1000$ in$^3$ for which the
total length of the 12 edges is a minimum using the Lagrange multiplier method.

(b) Find the change in the dimensions of the box when the volume is changed to
1200 in$^3$ by using the value of $\lambda^*$ found in part (a).

(c) Compare the solution found in part (b) with the exact solution.

2.57 Find the effect on $f^*$ of changing the constraint to (a) $x_1 + x_2 + 2x_3 = 4$ and (b) $x_1 + x_2 +
2x_3 = 2$ in Problem 2.48. Use the physical meaning of Lagrange multiplier in finding the
solution.

2.58 A real estate company wants to construct a multistory apartment building on a
500 × 500-ft lot. It has been decided to have a total floor space of $8 \times 10^5$ ft$^2$. The height
of each story is required to be 12 ft, the maximum height of the building is to be restricted
to 75 ft, and the parking area is required to be at least 10% of the total floor area according
to the city zoning rules. The cost of the building is estimated at \$$(500{,}000h +
2000F + 500P)$, where $h$ is the height in feet, $F$ is the floor area in square feet, and $P$
is the parking area in square feet. Find the minimum cost design of the building.


2.59 The Brinell hardness test is used to measure the indentation hardness of materials. It
involves penetration of an indenter, in the form of a ball of diameter $D$ (mm), under a
load $P$ (kg$_f$), as shown in Fig. 2.13a. The Brinell hardness number (BHN) is defined as

$$\text{BHN} = \frac{P}{A} = \frac{2P}{\pi D\left(D - \sqrt{D^2 - d^2}\right)} \tag{1}$$

where $A$ (in mm$^2$) is the spherical surface area and $d$ (in mm) is the diameter of the
crater or indentation formed. The diameter $d$ and the depth $h$ of indentation are related
by (Fig. 2.13b)

$$d = 2\sqrt{h(D - h)} \tag{2}$$

It is desired to find the size of indentation, in terms of the values of $d$ and $h$, when a
tungsten carbide ball indenter of diameter 10 mm is used under a load of $P = 3000$ kg$_f$
on a stainless steel test specimen of BHN 1250. Find the values of $d$ and $h$ by formulating
and solving the problem as an unconstrained minimization problem.

Hint: Consider the objective function as the sum of squares of the equations implied by
Eqs. (1) and (2).
Figure 2.13 Brinell hardness test.


2.60 A manufacturer produces small refrigerators at a cost of $60 per unit and sells them to 
a retailer in a lot consisting of a minimum of 100 units. The selling price is set at $80 
per unit if the retailer buys 100 units at a time. If the retailer buys more than 100 units 
at a time, the manufacturer agrees to reduce the price of all refrigerators by 10 cents for 
each unit bought over 100 units. Determine the number of units to be sold to the retailer 
to maximize the profit of the manufacturer. 

2.61 Consider the following problem: 

Minimize/ = (x\ — 2) 2 + ( xo — l) 2 


subject to 

2 > X\ + X2 

X 2 > x\ 

Using Kuhn-Tucker conditions, find which of the following vectors are local minima: 



2.62 Using Kuhn-Tucker conditions, find the value(s) of \(\beta\) for which the point \(x_1^* = 1\), \(x_2^* = 2\) 
will be optimal to the problem: 

Maximize \(f(x_1, x_2) = 2x_1 + \beta x_2\) 

subject to 

\(g_1(x_1, x_2) = x_1^2 + x_2^2 - 5 \le 0\) 
\(g_2(x_1, x_2) = x_1 - x_2 - 2 \le 0\) 


Verify your result using a graphical procedure. 

2.63 Consider the following optimization problem: 

Maximize \(f = -x_1 - x_2\) 

subject to 

\(x_1^2 + x_2 \ge 2\) 
\(4 \le x_1 + 3x_2\) 
x\ + x\ < 30 


(a) Find whether the design vector X = { 1 , 1 } T satisfies the Kuhn-Tucker conditions for 
a constrained optimum. 

(b) What are the values of the Lagrange multipliers at the given design vector? 

2.64 Consider the following problem: 

Maximize \(f(\mathbf{X}) = x_1^2 + x_2^2 + x_3^2\) 

subject to 

\(x_1 + x_2 + x_3 \ge 5\) 

\(2 - x_2 x_3 \le 0\) 

\(x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 2\) 

Determine whether the Kuhn-Tucker conditions are satisfied at the following points: 



\(\mathbf{X}_1 = \{3/2,\ 3/2,\ 2\}^T\), \(\mathbf{X}_2 = \{4/3,\ 2/3,\ 3\}^T\), \(\mathbf{X}_3 = \{2,\ 1,\ 2\}^T\) 


2.65 Find a usable and feasible direction S at (a) \(\mathbf{X}_1 = \{-1, 5\}^T\) and (b) \(\mathbf{X}_2 = \{2, 3\}^T\) for the 
following problem: 

Minimize \(f(\mathbf{X}) = (x_1 - 1)^2 + (x_2 - 5)^2\) 

subject to 

\(g_1(\mathbf{X}) = -x_1^2 + x_2 - 4 \le 0\) 
\(g_2(\mathbf{X}) = -(x_1 - 2)^2 + x_2 - 3 \le 0\) 

2.66 Consider the following problem: 

Maximize \(f = x_1^2 - x_2\) 

subject to 

\(26 \ge x_1^2 + x_2^2\) 

\(x_1 + x_2 \ge 6\) 

\(x_1 \ge 0\) 

Determine whether the following search direction is usable, feasible, or both at the design 
vector X = {^}: 



2.67 Consider the following problem: 

Minimize \(f = x_1^3 - 6x_1^2 + 11x_1 + x_3\) 

subject to 

\(x_1^2 + x_2^2 - x_3^2 \le 0\) 

\(4 - x_1^2 - x_2^2 - x_3^2 \le 0\) 
\(x_i \ge 0,\ i = 1, 2, 3, \quad x_3 \le 5\) 

Determine whether the following vector represents an optimum solution: 

\(\mathbf{X} = \{0,\ \sqrt{2},\ \sqrt{2}\}^T\) 


2.68 Minimize \(f = x_1^2 + 2x_2^2 + 3x_3^2\) 

subject to the constraints 

\(g_1 = x_1 - x_2 - 2x_3 \le 12\) 
\(g_2 = x_1 + 2x_2 - 3x_3 \le 8\) 

using Kuhn-Tucker conditions. 

2.69 Minimize \(f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 5)^2\) 

subject to 

\(-x_1^2 + x_2 \le 4\) 
\(-(x_1 - 2)^2 + x_2 \le 3\) 

by (a) the graphical method and (b) Kuhn-Tucker conditions. 

2.70 Maximize \(f = 8x_1 + 4x_2 + x_1 x_2 - x_1^2 - x_2^2\) 

subject to 

\(2x_1 + 3x_2 \le 24\) 
\(-5x_1 + 12x_2 \le 24\) 
\(x_2 \le 5\) 


by applying Kuhn-Tucker conditions. 

2.71 Consider the following problem: 

Maximize \(f(x) = (x - 1)^2\) 

subject to 

\(-2 \le x \le 4\) 

Determine whether the constraint qualification and Kuhn-Tucker conditions are satisfied 
at the optimum point. 

2.72 Consider the following problem: 

Minimize \(f = (x_1 - 1)^2 + (x_2 - 1)^2\) 

subject to 

\(2x_2 - (1 - x_1)^3 \le 0\) 
\(x_1 \ge 0\) 
\(x_2 \ge 0\) 

Determine whether the constraint qualification and the Kuhn-Tucker conditions are sat- 
isfied at the optimum point. 

2.73 Verify whether the following problem is convex: 

Minimize \(f(\mathbf{X}) = -4x_1 + x_1^2 - 2x_1 x_2 + 2x_2^2\) 

subject to 

\(2x_1 + x_2 \le 6\) 
\(x_1 - 4x_2 \le 0\) 
\(x_1 \ge 0,\ x_2 \ge 0\) 


2.74 Check the convexity of the following problems. 

(a) Minimize \(f(\mathbf{X}) = 2x_1 + 3x_2 - x_1^3 - 2x_2^2\) 
subject to 

\(x_1 + 3x_2 \le 6\) 
\(5x_1 + 2x_2 \le 10\) 
\(x_1 \ge 0,\ x_2 \ge 0\) 

(b) Minimize \(f(\mathbf{X}) = 9x_1^2 - 18x_1 x_2 + 13x_1 - 4\) 
subject to 

\(x_1^2 + x_2^2 + 2x_1 \ge 16\) 

2.75 Identify the optimum point among the given design vectors, \(\mathbf{X}_1\), \(\mathbf{X}_2\), and \(\mathbf{X}_3\), by applying 
the Kuhn-Tucker conditions to the following problem: 

Minimize \(f(\mathbf{X}) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2\) 

subject to 

x\— x\ > 0 
x\ — X 2 > 0 


X 2 <1 



2.76 Consider the following optimization problem: 

Minimize \(f = -x_1^2 - x_2^2 + x_1 x_2 + 7x_1 + 4x_2\) 

subject to 

\(2x_1 + 3x_2 \le 24\) 
\(-5x_1 + 12x_2 \le 24\) 
\(x_1 \ge 0,\ x_2 \ge 0,\ x_2 \le 4\) 

Find a usable feasible direction at each of the following design vectors: 

Linear Programming I: Simplex Method 


3.1 INTRODUCTION 

Linear programming is an optimization method applicable for the solution of prob- 
lems in which the objective function and the constraints appear as linear functions 
of the decision variables. The constraint equations in a linear programming problem 
may be in the form of equalities or inequalities. The linear programming type of opti- 
mization problem was first recognized in the 1930s by economists while developing 
methods for the optimal allocation of resources. During World War II the U.S. Air 
Force sought more effective procedures of allocating resources and turned to linear 
programming. George B. Dantzig, who was a member of the Air Force group, for- 
mulated the general linear programming problem and devised the simplex method of 
solution in 1947. This was a significant step in bringing linear programming into 
wider use. Afterward, much progress was made in the theoretical development and in 
the practical applications of linear programming. Among all the works, the theoretical 
contributions made by Kuhn and Tucker had a major impact in the development of the 
duality theory in LP. The works of Charnes and Cooper were responsible for industrial 
applications of LP. 

Linear programming is considered a revolutionary development that permits us to 
make optimal decisions in complex situations. At least four Nobel Prizes were awarded 
for contributions related to linear programming. For example, when the Nobel Prize 
in Economics was awarded in 1975 jointly to L. V. Kantorovich of the former Soviet 
Union and T. C. Koopmans of the United States, the citation for the prize mentioned 
their contributions on the application of LP to the economic problem of allocating 
resources [3.14]. George Dantzig, the inventor of LP, was awarded the National Medal 
of Science by President Gerald Ford in 1976. 

Although several other methods have been developed over the years for solving LP 
problems, the simplex method continues to be the most efficient and popular method for 
solving general LP problems. Among other methods, Karmarkar’s method, developed in 
1984, has been reported to be up to 50 times as fast as the simplex algorithm of Dantzig. In 
this chapter we present the theory, development, and applications of the simplex method 
for solving LP problems. Additional topics, such as the revised simplex method, duality 
theory, decomposition method, postoptimality analysis, and Karmarkar’s method, are 
considered in Chapter 4. 


3.2 APPLICATIONS OF LINEAR PROGRAMMING 

The number of applications of linear programming has been so large that it is not 
possible to describe all of them here. Only the early applications are mentioned here 
and the exercises at the end of this chapter give additional example applications of 
linear programming. One of the early industrial applications of linear programming 
was made in the petroleum refineries. In general, an oil refinery has a choice of buying 
crude oil from several different sources with differing compositions and at differing 
prices. It can manufacture different products, such as aviation fuel, diesel fuel, and 
gasoline, in varying quantities. The constraints may be due to the restrictions on the 
quantity of the crude oil available from a particular source, the capacity of the refinery 
to produce a particular product, and so on. A mix of the purchased crude oil and the 
manufactured products is sought that gives the maximum profit. 

The optimal production plan in a manufacturing firm can also be decided using 
linear programming. Since the sales of a firm fluctuate, the company can have various 
options. It can build up an inventory of the manufactured products to carry it through 
the period of peak sales, but this involves an inventory holding cost. It can also pay 
overtime rates to achieve higher production during periods of higher demand. Finally, 
the firm need not meet the extra sales demand during the peak sales period, thus losing 
a potential profit. Linear programming can take into account the various cost and loss 
factors and arrive at the most profitable production plan. 

In the food-processing industry, linear programming has been used to determine 
the optimal shipping plan for the distribution of a particular product from different 
manufacturing plants to various warehouses. In the iron and steel industry, linear pro- 
gramming is used to decide the types of products to be made in their rolling mills to 
maximize the profit. Metalworking industries use linear programming for shop loading 
and for determining the choice between producing and buying a part. Paper mills use 
it to decrease the amount of trim losses. The optimal routing of messages in a commu- 
nication network and the routing of aircraft and ships can also be decided using linear 
programming. 

Linear programming has also been applied to formulate and solve several types 
of engineering design problems, such as the plastic design of frame structures, as 
illustrated in the following example. 

Example 3.1 In the limit design of steel frames, it is assumed that plastic hinges 
will be developed at points with peak moments. When a sufficient number of hinges 
develop, the structure becomes an unstable system referred to as a collapse mechanism. 
Thus a design will be safe if the energy-absorbing capacity of the frame (U) is greater 
than the energy imparted by the externally applied loads (E) in each of the deformed 
shapes as indicated by the various collapse mechanisms [3.9]. 

For the rigid frame shown in Fig. 3.1, plastic moments may develop at the points of 
peak moments (numbered 1 through 7 in Fig. 3.1). Four possible collapse mechanisms 
are shown in Fig. 3.2 for this frame. Assuming that the weight is a linear function 
of the plastic moment capacities, find the values of the ultimate moment capacities 
\(M_b\) and \(M_c\) for minimum weight. Assume that the two columns are identical and that 
\(P_1 = 3\), \(P_2 = 1\), \(h = 8\), and \(l = 10\). 

Figure 3.1 Rigid frame. 

Figure 3.2 Collapse mechanisms of the frame. \(M_b\), moment carrying capacity of beam; \(M_c\), 
moment carrying capacity of column [3.9]. Energy expressions shown for the four mechanisms: 

\(E = P_1 \delta_1 = 24\theta\), \(U = 4M_c\theta\) 
\(E = P_2 \delta_2 = 10\theta\), \(U = 4M_b\theta\) 
\(E = P_1 \delta_1 + P_2 \delta_2 = 34\theta\), \(U = 4M_b\theta + 2M_c\theta\) 
\(E = P_1 \delta_1 = 24\theta\), \(U = 2M_b\theta + 2M_c\theta\) 

SOLUTION The objective function can be expressed as 

\[ f(M_b, M_c) = \text{weight of beam} + \text{weight of columns} = \alpha(2lM_b + 2hM_c) \]

where \(\alpha\) is a constant indicating the weight per unit length of the member with a 
unit plastic moment capacity. Since a constant multiplication factor does not affect the 
result, f can be taken as 

\[ f = 2lM_b + 2hM_c = 20M_b + 16M_c \tag{E$_1$} \]

The constraints (\(U \ge E\)) from the four collapse mechanisms can be expressed as 

\[
\begin{aligned}
M_c &\ge 6 \\
M_b &\ge 2.5 \\
2M_b + M_c &\ge 17 \\
M_b + M_c &\ge 12
\end{aligned} \tag{E$_2$}
\]
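
As a brief check of where each inequality comes from, the safety condition U ≥ E can be written out for the four mechanisms of Fig. 3.2, using the energy expressions given with that figure. The common rotation θ is assumed here to be positive so that it can be cancelled; that assumption is made for this sketch and is not stated explicitly in the example.

```latex
\begin{aligned}
4M_c\theta \ge 24\theta \;&\Longrightarrow\; M_c \ge 6 \\
4M_b\theta \ge 10\theta \;&\Longrightarrow\; M_b \ge 2.5 \\
4M_b\theta + 2M_c\theta \ge 34\theta \;&\Longrightarrow\; 2M_b + M_c \ge 17 \\
2M_b\theta + 2M_c\theta \ge 24\theta \;&\Longrightarrow\; M_b + M_c \ge 12
\end{aligned}
```

Each line is linear in the two unknowns, which is what makes the minimum-weight design a linear programming problem.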


3.3 STANDARD FORM OF A LINEAR PROGRAMMING PROBLEM 

The general linear programming problem can be stated in the following standard 
forms: 

Scalar Form 


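\[ \text{Minimize } f(x_1, x_2, \ldots, x_n) = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \tag{3.1a} \]

subject to the constraints 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{3.2a}
\]

\[ x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0 \tag{3.3a} \]

where \(c_j\), \(b_i\), and \(a_{ij}\) (\(i = 1, 2, \ldots, m\); \(j = 1, 2, \ldots, n\)) are known constants, and \(x_j\) 
are the decision variables. 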


Matrix Form 

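\[ \text{Minimize } f(\mathbf{X}) = \mathbf{c}^T \mathbf{X} \tag{3.1b} \]

subject to the constraints 

\[ \mathbf{a}\mathbf{X} = \mathbf{b} \tag{3.2b} \]
\[ \mathbf{X} \ge \mathbf{0} \tag{3.3b} \]

where \(\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T\), \(\mathbf{b} = \{b_1, b_2, \ldots, b_m\}^T\), \(\mathbf{c} = \{c_1, c_2, \ldots, c_n\}^T\), 
and \(\mathbf{a} = [a_{ij}]\) is the \(m \times n\) matrix of constraint coefficients. 

The characteristics of a linear programming problem, stated in standard form, are 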

1. The objective function is of the minimization type. 

2. All the constraints are of the equality type. 

3. All the decision variables are nonnegative. 

It is now shown that any linear programming problem can be expressed in standard 
form by using the following transformations. 

1. The maximization of a function \(f(x_1, x_2, \ldots, x_n)\) is equivalent to the 
minimization of the negative of the same function. For example, the objective function 

minimize \(f = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n\) 

is equivalent to 

maximize \(f' = -f = -c_1 x_1 - c_2 x_2 - \cdots - c_n x_n\) 

Consequently, the objective function can be stated in the minimization form in 
any linear programming problem. 

2. In most engineering optimization problems, the decision variables represent 
some physical dimensions, and hence the variables \(x_j\) will be nonnegative. 
However, a variable may be unrestricted in sign in some problems. In such 
cases, an unrestricted variable (which can take a positive, negative, or zero 
value) can be written as the difference of two nonnegative variables. Thus if \(x_j\) 
is unrestricted in sign, it can be written as \(x_j = x_j' - x_j''\), where 

\(x_j' \ge 0\) and \(x_j'' \ge 0\) 

It can be seen that \(x_j\) will be negative, zero, or positive, depending on whether 
\(x_j''\) is greater than, equal to, or less than \(x_j'\). 

3. If a constraint appears in the form of a “less than or equal to” type of 
inequality as 

\[ a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{kn}x_n \le b_k \]

it can be converted into the equality form by adding a nonnegative slack variable 
\(x_{n+1}\) as follows: 

\[ a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{kn}x_n + x_{n+1} = b_k \]

Similarly, if the constraint is in the form of a “greater than or equal to” type of 
inequality as 

\[ a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{kn}x_n \ge b_k \]

it can be converted into the equality form by subtracting a variable as 

\[ a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{kn}x_n - x_{n+1} = b_k \]

where \(x_{n+1}\) is a nonnegative variable known as a surplus variable. 

It can be seen that there are m equations in n decision variables in a linear 
programming problem. We can assume that \(m < n\); for if \(m > n\), there would be \(m - n\) 
redundant equations that could be eliminated. The case \(n = m\) is of no interest, for then 
there is either a unique solution X that satisfies Eqs. (3.2) and (3.3) (in which case there 
can be no optimization) or no solution, in which case the constraints are inconsistent. 
The case \(m < n\) corresponds to an underdetermined set of linear equations, which, if 
they have one solution, have an infinite number of solutions. The problem of linear 
programming is to find one of these solutions that satisfies Eqs. (3.2) and (3.3) and 
yields the minimum of f. 
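
The transformations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_standard_form`, the `(coefficients, sense, rhs)` row format, and the small example problem are all hypothetical, and every original variable is assumed to be nonnegative already (the splitting of an unrestricted variable into a difference of two nonnegative variables is not handled).

```python
from fractions import Fraction


def to_standard_form(c, rows, maximize=False):
    """Return (c_std, A_std, b_std) for: minimize c_std.x, A_std x = b_std, x >= 0.

    c    : objective coefficients of the original variables
    rows : list of (coeffs, sense, rhs) with sense in {"<=", ">=", "="}
    Assumes the original variables are already nonnegative (hypothetical helper).
    """
    n_extra = sum(1 for _, sense, _ in rows if sense != "=")
    sign = -1 if maximize else 1          # maximize f  <=>  minimize -f
    c_std = [sign * Fraction(cj) for cj in c] + [Fraction(0)] * n_extra

    A_std, b_std = [], []
    k = 0                                 # column index of the next added variable
    for coeffs, sense, rhs in rows:
        extra = [Fraction(0)] * n_extra
        if sense == "<=":
            extra[k] = Fraction(1)        # slack variable is added
            k += 1
        elif sense == ">=":
            extra[k] = Fraction(-1)       # surplus variable is subtracted
            k += 1
        A_std.append([Fraction(a) for a in coeffs] + extra)
        b_std.append(Fraction(rhs))
    return c_std, A_std, b_std


# Illustrative problem (not from the text):
#   maximize 3*x1 + 2*x2
#   subject to x1 + x2 <= 4,  x1 - x2 >= 1,  x1 + 2*x2 = 5
c_std, A_std, b_std = to_standard_form(
    [3, 2],
    [([1, 1], "<=", 4), ([1, -1], ">=", 1), ([1, 2], "=", 5)],
    maximize=True,
)
for row, rhs in zip(A_std, b_std):
    print([str(v) for v in row], "=", rhs)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that coefficients such as 1.5 are not subject to rounding; a floating-point version would work equally well for illustration.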


3.4 GEOMETRY OF LINEAR PROGRAMMING PROBLEMS 

A linear programming problem with only two variables presents a simple case for which 
the solution can be obtained by using a rather elementary graphical method. Apart 
from the solution, the graphical method gives a physical picture of certain geometrical 
characteristics of linear programming problems. The following example is considered 
to illustrate the graphical method of solution. 

Example 3.2 A manufacturing firm produces two machine parts using lathes, milling 
machines, and grinding machines. The different machining times required for each part, 
the machining times available on different machines, and the profit on each machine 
part are given in the following table. 


                      Machining time required (min)      Maximum time available 
Type of machine       Machine part I   Machine part II   per week (min) 

Lathes                      10                5                2500 
Milling machines             4               10                2000 
Grinding machines            1                1.5               450 

Profit per unit            $50             $100 

Determine the number of parts I and II to be manufactured per week to maximize the 
profit. 

SOLUTION Let the number of machine parts I and II manufactured per week be 
denoted by x and y, respectively. The constraints due to the maximum time limitations 
on the various machines are given by 

\[ 10x + 5y \le 2500 \tag{E$_1$} \]
\[ 4x + 10y \le 2000 \tag{E$_2$} \]
\[ x + 1.5y \le 450 \tag{E$_3$} \]

Since the variables x and y cannot take negative values, we have 

\[ x \ge 0, \quad y \ge 0 \tag{E$_4$} \]

The total profit is given by 

\[ f(x, y) = 50x + 100y \tag{E$_5$} \]

Thus the problem is to determine the nonnegative values of x and y that satisfy the 
constraints stated in Eqs. (E₁) to (E₃) and maximize the objective function given by 
Eq. (E₅). The inequalities (E₁) to (E₄) can be plotted in the xy plane and the feasible 
region identified as shown in Fig. 3.3. Our objective is to find at least one point out of the 
infinitely many points in the shaded region of Fig. 3.3 that maximizes the profit function (E₅). 
The contours of the objective function, f, are defined by the linear equation 

\[ 50x + 100y = k = \text{constant} \]

As k is varied, the objective function line is moved parallel to itself. The maximum 
value of f is the largest k whose objective function line has at least one point in 
common with the feasible region. Such a point can be identified as point G in Fig. 3.4. 
The optimum solution corresponds to a value of \(x^* = 187.5\), \(y^* = 125.0\) and a profit 
of $21,875.00. 
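
The graphical result can be cross-checked numerically. The sketch below is not the graphical method itself: it intersects every pair of constraint boundary lines, keeps the intersections that satisfy all of Eqs. (E₁) to (E₄), and evaluates the profit at those corner points. The names `CONSTRAINTS`, `feasible_vertices`, and `profit` are illustrative choices, and the approach assumes a bounded two-variable problem like this one.

```python
from fractions import Fraction
from itertools import combinations

# Constraints in the form a*x + b*y <= c; the last two rows encode x >= 0 and y >= 0.
CONSTRAINTS = [
    (Fraction(10), Fraction(5), Fraction(2500)),    # lathes, Eq. (E1)
    (Fraction(4), Fraction(10), Fraction(2000)),    # milling machines, Eq. (E2)
    (Fraction(1), Fraction(3, 2), Fraction(450)),   # grinding machines, Eq. (E3)
    (Fraction(-1), Fraction(0), Fraction(0)),       # -x <= 0
    (Fraction(0), Fraction(-1), Fraction(0)),       # -y <= 0
]


def feasible_vertices(constraints):
    """Intersect each pair of boundary lines and keep the feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:                      # parallel boundaries do not intersect
            continue
        x = (c1 * b2 - c2 * b1) / det     # Cramer's rule for the 2 x 2 system
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return points


def profit(point, rates=(50, 100)):
    """Profit for given per-unit rates on parts I and II."""
    return rates[0] * point[0] + rates[1] * point[1]


vertices = feasible_vertices(CONSTRAINTS)
best = max(vertices, key=profit)
print("corner points:", sorted((float(x), float(y)) for x, y in vertices))
print("best corner:", float(best[0]), float(best[1]), "profit:", float(profit(best)))
```

Calling `profit` with `rates=(40, 100)` gives the same value at two adjacent corner points, which is the numerical counterpart of an objective contour lying parallel to an edge of the feasible region.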


Figure 3.3 Feasible region given by Eqs. (E₁) to (E₄). 

Figure 3.4 Contours of objective function. 


In some cases, the optimum solution may not be unique. For example, if the 
profit rates for the machine parts I and II are $40 and $100 instead of $50 and $100, 
respectively, the contours of the profit function will be parallel to side CG of the 
feasible region as shown in Fig. 3.5. In this case, line P"Q", which coincides with the 
boundary line CG, will correspond to the maximum (feasible) profit. Thus there is no 
unique optimal solution to the problem and any point between C and G on line P"Q" 
can be taken as an optimum solution with a profit value of $20,000. There are three 
other possibilities. In some problems, the feasible region may not be a closed convex 
polygon. In such a case, it may happen that the profit level can be increased to an 
infinitely large value without leaving the feasible region, as shown in Fig. 3.6. In this 
case the solution of the linear programming problem is said to be unbounded. On the 
other extreme, the constraint set may be empty in some problems. This could be due 
to the inconsistency of the constraints; or, sometimes, even though the constraints may 
be consistent, no point satisfying the constraints may also satisfy the nonnegativity 
restrictions. The last possible case is when the feasible region consists of a single 
point. This can occur only if the number of constraints is at least equal to the number 
of variables. A problem of this kind is of no interest to us since there is only one 
feasible point and there is nothing to be optimized. 

Thus a linear programming problem may have (1) a unique and finite optimum 
solution, (2) an infinite number of optimal solutions, (3) an unbounded solution, (4) no 
solution, or (5) a unique feasible point. Assuming that the linear programming problem 
is properly formulated, the following general geometrical characteristics can be noted 
from the graphical solution: 

1. The feasible region is a convex polygon.† 

2. The optimum value occurs at an extreme point or vertex of the feasible region. 

†A convex polygon consists of a set of points having the property that the line segment joining any two 
points in the set is entirely in the convex set. In problems having more than two decision variables, the 
feasible region is called a convex polyhedron, which is defined in the next section. 

3.5 DEFINITIONS AND THEOREMS 

The geometrical characteristics of a linear programming problem stated in Section 3.4 
can be proved mathematically. Some of the more powerful methods of solving linear 
programming problems take advantage of these characteristics. The terminology used 
in linear programming and some of the important theorems are presented in this section. 

Definitions 

1. Point in n-dimensional space. A point X in an n-dimensional space is 
characterized by an ordered set of n values or coordinates \((x_1, x_2, \ldots, x_n)\). The 
coordinates of X are also called the components of X. 

2. Line segment in n dimensions (L). If the coordinates of two points A and B 
are given by \(x_j^{(1)}\) and \(x_j^{(2)}\) (\(j = 1, 2, \ldots, n\)), the line segment (L) joining these 
points is the collection of points \(\mathbf{X}(\lambda)\) whose coordinates are given by 
\(x_j = \lambda x_j^{(1)} + (1 - \lambda)x_j^{(2)}\), \(j = 1, 2, \ldots, n\), with \(0 \le \lambda \le 1\). 

Thus 

\[ L = \{\mathbf{X} \mid \mathbf{X} = \lambda\mathbf{X}^{(1)} + (1 - \lambda)\mathbf{X}^{(2)}\} \tag{3.4} \]

In one dimension, for example, it is easy to see that the definition is in 
accordance with our experience (Fig. 3.7): 

\[ x^{(2)} - x(\lambda) = \lambda[x^{(2)} - x^{(1)}], \quad 0 \le \lambda \le 1 \tag{3.5} \]

whence 

\[ x(\lambda) = \lambda x^{(1)} + (1 - \lambda)x^{(2)}, \quad 0 \le \lambda \le 1 \tag{3.6} \]

3. Hyperplane. In n-dimensional space, the set of points whose coordinates satisfy 
a linear equation 

\[ a_1 x_1 + \cdots + a_n x_n = \mathbf{a}^T\mathbf{X} = b \tag{3.7} \]

is called a hyperplane. A hyperplane, H, is represented as 

\[ H(\mathbf{a}, b) = \{\mathbf{X} \mid \mathbf{a}^T\mathbf{X} = b\} \tag{3.8} \]

A hyperplane has \(n - 1\) dimensions in an n-dimensional space. For example, 
in three-dimensional space it is a plane, and in two-dimensional space it is a 
line. The set of points whose coordinates satisfy a linear inequality like 
\(a_1 x_1 + \cdots + a_n x_n \le b\) is called a closed half-space, closed due to the inclusion of an 
equality sign in the inequality above. A hyperplane partitions the n-dimensional 
space (\(E^n\)) into two closed half-spaces, so that 

\[ H^+ = \{\mathbf{X} \mid \mathbf{a}^T\mathbf{X} \ge b\} \tag{3.9} \]

\[ H^- = \{\mathbf{X} \mid \mathbf{a}^T\mathbf{X} \le b\} \tag{3.10} \]

This is illustrated in Fig. 3.8 in the case of a two-dimensional space (\(E^2\)). 


Figure 3.7 Line segment. 

Figure 3.8 Hyperplane in two dimensions. 

4. Convex set. A convex set is a collection of points such that if \(\mathbf{X}^{(1)}\) and \(\mathbf{X}^{(2)}\) are 
any two points in the collection, the line segment joining them is also in the 
collection. A convex set, S, can be defined mathematically as follows: 

If \(\mathbf{X}^{(1)}, \mathbf{X}^{(2)} \in S\), then \(\mathbf{X} \in S\) 

where 

\[ \mathbf{X} = \lambda\mathbf{X}^{(1)} + (1 - \lambda)\mathbf{X}^{(2)}, \quad 0 \le \lambda \le 1 \]

A set containing only one point is always considered to be convex. Some 
examples of convex sets in two dimensions are shown shaded in Fig. 3.9. On 
the other hand, the sets depicted by the shaded region in Fig. 3.10 are not 
convex. The L-shaped region, for example, is not a convex set because it is 
possible to find two points a and b in the set such that not all points on the line 
joining them belong to the set. 

5. Convex polyhedron and convex polytope. A convex polyhedron is a set of points 
common to one or more half-spaces. A convex polyhedron that is bounded is 
called a convex polytope. 

Figure 3.11a and b represents convex polytopes in two and three dimensions, 
and Fig. 3.11c and d denotes convex polyhedra in two and three dimensions. It 
can be seen that a convex polygon, shown in Fig. 3.11a and c, can be considered 
as the intersection of one or more half-planes. 

Figure 3.10 Nonconvex sets. 

Figure 3.11 Convex polytopes in two and three dimensions (a, b) and convex polyhedra in 
two and three dimensions (c, d). 

6. Vertex or extreme point. This is a point in the convex set that does not lie on a 
line segment joining two other points of the set. For example, every point on 
the circumference of a circle and each corner point of a polygon can be called 
a vertex or extreme point. 

7. Feasible solution. In a linear programming problem, any solution that satisfies 
the constraints 

\[ \mathbf{a}\mathbf{X} = \mathbf{b} \tag{3.2} \]

\[ \mathbf{X} \ge \mathbf{0} \tag{3.3} \]

is called a feasible solution. 

8. Basic solution. A basic solution is one in which \(n - m\) variables are set equal 
to zero. A basic solution can be obtained by setting \(n - m\) variables to zero and 
solving the constraint Eqs. (3.2) simultaneously. 

9. Basis. The collection of variables not set equal to zero to obtain the basic 
solution is called the basis. 

10. Basic feasible solution. This is a basic solution that satisfies the nonnegativity 
conditions of Eq. (3.3). 

11. Nondegenerate basic feasible solution. This is a basic feasible solution that has 
exactly m positive \(x_i\). 

12. Optimal solution. A feasible solution that optimizes the objective function is 
called an optimal solution. 

13. Optimal basic solution. This is a basic feasible solution for which the objective 
function is optimal. 
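
Definitions 8 to 11 can be made concrete with a small enumeration. The sketch below takes the constraints of Example 3.2 with one slack variable added to each inequality (so m = 3 equations in n = 5 variables), sets n − m = 2 variables to zero in every possible way, and solves for the remaining three. The helper names `solve_square` and `basic_solutions` are illustrative, and exhaustive enumeration is shown only because the problem is tiny; it is not how the simplex method proceeds.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination; return None if M is singular."""
    m = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(m):
        piv = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def basic_solutions(A, b):
    """Return (basis, x) for every choice of m basic variables with a nonsingular basis."""
    m, n = len(A), len(A[0])
    found = []
    for basis in combinations(range(n), m):
        sub = [[A[i][j] for j in basis] for i in range(m)]
        xb = solve_square(sub, b)
        if xb is None:
            continue
        x = [Fraction(0)] * n             # the n - m nonbasic variables stay at zero
        for j, v in zip(basis, xb):
            x[j] = v
        found.append((basis, x))
    return found


# Columns: x, y, and the slack variables of the lathe, milling and grinding constraints.
A = [[10, 5, 1, 0, 0],
     [4, 10, 0, 1, 0],
     [1, Fraction(3, 2), 0, 0, 1]]
b = [2500, 2000, 450]

sols = basic_solutions(A, b)
feasible = [(B, x) for B, x in sols if all(v >= 0 for v in x)]
for B, x in feasible:
    print("basis", B, "x =", [float(v) for v in x])
```

For this data the basic feasible solutions are the ones whose (x, y) components are corner points of the feasible region, which is the content of Theorem 3.4 stated later in this section.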


Theorems. The basic theorems of linear programming can now be stated and proved.† 

Theorem 3.1 The intersection of any number of convex sets is also convex. 

Proof: Let the given convex sets be represented as \(R_i\) (\(i = 1, 2, \ldots, K\)) and their 
intersection as R, so that‡ 

\[ R = \bigcap_{i=1}^{K} R_i \]

If the points \(\mathbf{X}^{(1)}, \mathbf{X}^{(2)} \in R\), then from the definition of intersection both points 
belong to every \(R_i\), and since each \(R_i\) is convex, 

\[ \mathbf{X} = \lambda\mathbf{X}^{(1)} + (1 - \lambda)\mathbf{X}^{(2)} \in R_i \quad (i = 1, 2, \ldots, K), \quad 0 \le \lambda \le 1 \]

Thus 

\[ \mathbf{X} \in R = \bigcap_{i=1}^{K} R_i \]

and the theorem is proved. Physically, the theorem states that if there are a number of 
convex sets represented by \(R_1, R_2, \ldots\), the set of points R common to all these sets 
will also be convex. Figure 3.12 illustrates the meaning of this theorem for the case 
of two convex sets. 


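Theorem 3.2 The feasible region of a linear programming problem is convex. 

†The proofs of the theorems are not needed for an understanding of the material presented in subsequent 
sections. 

‡The symbol \(\cap\) represents the intersection of sets. 

Proof: The feasible region S of a standard linear programming problem is defined as 

\[ S = \{\mathbf{X} \mid \mathbf{a}\mathbf{X} = \mathbf{b},\ \mathbf{X} \ge \mathbf{0}\} \tag{3.11} \]

Let the points \(\mathbf{X}_1\) and \(\mathbf{X}_2\) belong to the feasible set S so that 

\[ \mathbf{a}\mathbf{X}_1 = \mathbf{b}, \quad \mathbf{X}_1 \ge \mathbf{0} \tag{3.12} \]

\[ \mathbf{a}\mathbf{X}_2 = \mathbf{b}, \quad \mathbf{X}_2 \ge \mathbf{0} \tag{3.13} \]

Multiply Eq. (3.12) by \(\lambda\) and Eq. (3.13) by \((1 - \lambda)\) and add them to obtain 

\[ \mathbf{a}[\lambda\mathbf{X}_1 + (1 - \lambda)\mathbf{X}_2] = \lambda\mathbf{b} + (1 - \lambda)\mathbf{b} = \mathbf{b} \]

that is, 

\[ \mathbf{a}\mathbf{X}_\lambda = \mathbf{b} \]

where 

\[ \mathbf{X}_\lambda = \lambda\mathbf{X}_1 + (1 - \lambda)\mathbf{X}_2 \]

Thus the point \(\mathbf{X}_\lambda\) satisfies the constraints and if 

\[ 0 \le \lambda \le 1, \quad \mathbf{X}_\lambda \ge \mathbf{0} \]

Hence the theorem is proved. 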

Theorem 3.3 Any local minimum solution is global for a linear programming problem. 

Proof: In the case of a function of one variable, the minimum (maximum) of a function 
f(x) is obtained at a value x at which the derivative is zero. This may be a point like 
A (\(x = x_1\)) in Fig. 3.13, where f(x) is only a relative (local) minimum, or a point like 
B (\(x = x_2\)), where f(x) is a global minimum. Any solution that is a local minimum 
solution is also a global minimum solution for the linear programming problem. To see 
this, let A be the local minimum solution and assume that it is not a global minimum 
solution so that there is another point B at which \(f_B < f_A\). Let the coordinates of A 
and B be given by \(\{x_1, x_2, \ldots, x_n\}^T\) and \(\{y_1, y_2, \ldots, y_n\}^T\), respectively. Then any point 
\(C = \{z_1, z_2, \ldots, z_n\}^T\) that lies on the line segment joining the two points A and B is 
a feasible solution and \(f_C = \lambda f_A + (1 - \lambda)f_B\). In this case, the value of f decreases 
uniformly from \(f_A\) to \(f_B\), and thus all points on the line segment between A and B 
(including those in the neighborhood of A) have f values less than \(f_A\) and correspond 
to feasible solutions. Hence it is not possible to have a local minimum at A and at the 
same time another point B such that \(f_A > f_B\). This means that for all B, \(f_A \le f_B\), so 
that \(f_A\) is the global minimum value. 

Figure 3.13 Local and global minima. 
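
The expression for f at the intermediate point C follows from the linearity of the objective function. A short derivation is sketched below, assuming the point C is parametrized as in Eq. (3.4), so that λ = 1 corresponds to A and λ = 0 to B, and writing the objective with the coefficients c_j of Eq. (3.1a).

```latex
\begin{aligned}
z_j &= \lambda x_j + (1 - \lambda) y_j, \qquad 0 \le \lambda \le 1, \\
f_C &= \sum_{j=1}^{n} c_j z_j
     = \lambda \sum_{j=1}^{n} c_j x_j + (1 - \lambda) \sum_{j=1}^{n} c_j y_j
     = \lambda f_A + (1 - \lambda) f_B, \\
f_C - f_A &= (1 - \lambda)(f_B - f_A) < 0 \qquad \text{for } 0 \le \lambda < 1 \text{ when } f_B < f_A .
\end{aligned}
```

The last line shows that points arbitrarily close to A (λ close to 1) already have a smaller objective value, which is what contradicts the assumption that A is a local minimum.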

The generalized version of this theorem is proved in Appendix A so that it can be 
applied to nonlinear programming problems also. 

Theorem 3.4 Every basic feasible solution is an extreme point of the convex set of 
feasible solutions. 

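Theorem 3.5 Let S be a closed, bounded convex polyhedron with \(\mathbf{X}_i^e\), \(i = 1\) to p, as 
the set of its extreme points. Then any vector \(\mathbf{X} \in S\) can be written as 

\[ \mathbf{X} = \sum_{i=1}^{p} \lambda_i \mathbf{X}_i^e, \qquad \lambda_i \ge 0, \qquad \sum_{i=1}^{p} \lambda_i = 1 \]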

Theorem 3.6 Let S be a closed convex polyhedron. Then the minimum of a linear 
function over S is attained at an extreme point of S. 

The proofs of Theorems 3.4 to 3.6 can be found in Ref. [3.1]. 


3.6 SOLUTION OF A SYSTEM OF LINEAR SIMULTANEOUS 
EQUATIONS 

Before studying the most general method of solving a linear programming problem, it 
will be useful to review the methods of solving a system of linear equations. Hence 
in the present section we review some of the elementary concepts of linear equations. 
Consider the following system of n equations in n unknowns: 


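\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \qquad (E_1) \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \qquad (E_2) \\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3 \qquad (E_3) \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n \qquad (E_n)
\end{aligned} \tag{3.14}
\]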

Assuming that this set of equations possesses a unique solution, a method of solving 
the system consists of reducing the equations to a form known as canonical form . 

It is well known from elementary algebra that the solution of Eqs. (3.14) will not be 
altered under the following elementary operations: (1) any equation \(E_r\) is replaced by 
the equation \(kE_r\), where k is a nonzero constant, and (2) any equation \(E_r\) is replaced by 
the equation \(E_r + kE_s\), where \(E_s\) is any other equation of the system. By making use of 
these elementary operations, the system of Eqs. (3.14) can be reduced to a convenient 
equivalent form as follows. Let us select some variable \(x_i\) and try to eliminate it from all 
the equations except the jth one (for which \(a_{ji}\) is nonzero). This can be accomplished 
by dividing the jth equation by \(a_{ji}\) and subtracting \(a_{ki}\) times the result from each of the 
other equations, \(k = 1, 2, \ldots, j - 1, j + 1, \ldots, n\). The resulting system of equations 
can be written as 

\[
\begin{aligned}
a'_{11}x_1 + a'_{12}x_2 + \cdots + a'_{1,i-1}x_{i-1} + 0x_i + a'_{1,i+1}x_{i+1} + \cdots + a'_{1n}x_n &= b'_1 \\
a'_{21}x_1 + a'_{22}x_2 + \cdots + a'_{2,i-1}x_{i-1} + 0x_i + a'_{2,i+1}x_{i+1} + \cdots + a'_{2n}x_n &= b'_2 \\
&\;\;\vdots \\
a'_{j-1,1}x_1 + a'_{j-1,2}x_2 + \cdots + a'_{j-1,i-1}x_{i-1} + 0x_i + a'_{j-1,i+1}x_{i+1} + \cdots + a'_{j-1,n}x_n &= b'_{j-1} \\
a'_{j1}x_1 + a'_{j2}x_2 + \cdots + a'_{j,i-1}x_{i-1} + 1x_i + a'_{j,i+1}x_{i+1} + \cdots + a'_{jn}x_n &= b'_j \\
a'_{j+1,1}x_1 + a'_{j+1,2}x_2 + \cdots + a'_{j+1,i-1}x_{i-1} + 0x_i + a'_{j+1,i+1}x_{i+1} + \cdots + a'_{j+1,n}x_n &= b'_{j+1} \\
&\;\;\vdots \\
a'_{n1}x_1 + a'_{n2}x_2 + \cdots + a'_{n,i-1}x_{i-1} + 0x_i + a'_{n,i+1}x_{i+1} + \cdots + a'_{nn}x_n &= b'_n
\end{aligned} \tag{3.15}
\]

where the primes indicate that the \(a'_{ij}\) and \(b'_j\) are changed from the original system. 
This procedure of eliminating a particular variable from all but one equation is called 
a pivot operation. The system of Eqs. (3.15) produced by the pivot operation has 
exactly the same solution as the original set of Eqs. (3.14). That is, the vector X that 
satisfies Eqs. (3.14) satisfies Eqs. (3.15), and vice versa. 

Next, if we take the system of Eqs. (3.15) and perform a new pivot operation 
by eliminating \(x_s\), \(s \ne i\), in all the equations except the tth equation, \(t \ne j\), the zeros 
or the 1 in the ith column will not be disturbed. The pivotal operations can be repeated 
by using a different variable and equation each time until the system of Eqs. (3.14) is 
reduced to the form 

\[
\begin{aligned}
1x_1 + 0x_2 + 0x_3 + \cdots + 0x_n &= b''_1 \\
0x_1 + 1x_2 + 0x_3 + \cdots + 0x_n &= b''_2 \\
0x_1 + 0x_2 + 1x_3 + \cdots + 0x_n &= b''_3 \\
&\;\;\vdots \\
0x_1 + 0x_2 + 0x_3 + \cdots + 1x_n &= b''_n
\end{aligned} \tag{3.16}
\]

This system of Eqs. (3.16) is said to be in canonical form and has been obtained after 
carrying out n pivot operations. From the canonical form, the solution vector can be 
directly obtained as 

\[ x_i = b''_i, \quad i = 1, 2, \ldots, n \tag{3.17} \]

Since the set of Eqs. (3.16) has been obtained from Eqs. (3.14) only through elementary 
operations, the system of Eqs. (3.16) is equivalent to the system of Eqs. (3.14). Thus 
the solution given by Eqs. (3.17) is the desired solution of Eqs. (3.14). 
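
The pivot operation and the reduction to canonical form can be sketched as follows. This is a minimal illustration, assuming a square system with a unique solution; the function names `pivot` and `to_canonical` and the 3 × 3 example system are hypothetical. Equations are also reordered when a zero appears in the pivot position, which does not change the solution but is a step added here for convenience.

```python
from fractions import Fraction


def pivot(T, r, s):
    """Pivot the augmented rows T on element T[r][s].

    Equation r is divided by the pivot element, and a multiple of the result is
    subtracted from every other equation so that variable s is eliminated there.
    """
    p = T[r][s]
    if p == 0:
        raise ValueError("pivot element must be nonzero")
    T[r] = [v / p for v in T[r]]
    for k in range(len(T)):
        if k != r:
            f = T[k][s]
            T[k] = [vk - f * vr for vk, vr in zip(T[k], T[r])]


def to_canonical(A, b):
    """Reduce the square system A x = b to canonical form by n pivot operations."""
    T = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    n = len(T)
    for j in range(n):
        # Choose a not-yet-used equation whose coefficient of x_j is nonzero
        # (raises StopIteration if the system has no unique solution).
        r = next(k for k in range(j, n) if T[k][j] != 0)
        T[j], T[r] = T[r], T[j]
        pivot(T, j, j)
    return T


# Illustrative system (not from the text):
#    2*x1 + x2 -   x3 =   8
#   -3*x1 - x2 + 2*x3 = -11
#   -2*x1 + x2 + 2*x3 =  -3
T = to_canonical([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
solution = [row[-1] for row in T]
print([str(v) for v in solution])
```

After the last pivot the coefficient part of `T` is the identity matrix, so the right-hand column can be read off directly as the solution, as in Eqs. (3.17).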


3.7 PIVOTAL REDUCTION OF A GENERAL SYSTEM OF 
EQUATIONS 

Instead of a square system, let us consider a system of m equations in n variables with 
n > m. This system of equations is assumed to be consistent so that it will have at 
least one solution: 


$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned} \tag{3.18}$$

The solution vector(s) X that satisfy Eqs. (3.18) are not evident from the equations.
However, it is possible to reduce this system to an equivalent canonical system from
which at least one solution can readily be deduced. If pivotal operations with respect
to any set of m variables, say, $x_1, x_2, \ldots, x_m$, are carried out, the resulting set of equations
can be written as follows:


Canonical system with pivotal variables $x_1, x_2, \ldots, x_m$:

$$\begin{aligned}
1x_1 + 0x_2 + \cdots + 0x_m + a''_{1,m+1} x_{m+1} + \cdots + a''_{1n} x_n &= b''_1 \\
0x_1 + 1x_2 + \cdots + 0x_m + a''_{2,m+1} x_{m+1} + \cdots + a''_{2n} x_n &= b''_2 \\
&\;\vdots \\
0x_1 + 0x_2 + \cdots + 1x_m + a''_{m,m+1} x_{m+1} + \cdots + a''_{mn} x_n &= b''_m
\end{aligned} \tag{3.19}$$

Here $x_1, \ldots, x_m$ are the pivotal variables, $x_{m+1}, \ldots, x_n$ are the nonpivotal or
independent variables, and $b''_1, \ldots, b''_m$ are the constants.

One special solution that can always be deduced from the system of Eqs. (3.19) is

$$x_i = \begin{cases} b''_i, & i = 1, 2, \ldots, m \\ 0, & i = m+1, m+2, \ldots, n \end{cases} \tag{3.20}$$

This solution is called a basic solution since the solution vector contains no more
than m nonzero terms. The pivotal variables $x_i$, $i = 1, 2, \ldots, m$, are called the basic
variables and the other variables $x_i$, $i = m+1, m+2, \ldots, n$, are called the nonbasic
variables. Of course, this is not the only solution, but it is the one most readily deduced
from Eqs. (3.19). If all $b''_i$, $i = 1, 2, \ldots, m$, in the solution given by Eqs. (3.20) are
nonnegative, it satisfies Eqs. (3.3) in addition to Eqs. (3.2), and hence it can be called
a basic feasible solution.

It is possible to obtain the other basic solutions from the canonical system of Eqs.
(3.19). We can perform an additional pivotal operation on the system after it is in
canonical form, by choosing $a''_{pq}$ (which is nonzero) as the pivot term, $q > m$, and
using any row p (among $1, 2, \ldots, m$). The new system will still be in canonical form
but with $x_q$ as the pivotal variable in place of $x_p$. The variable $x_p$, which was a basic
variable in the original canonical form, will no longer be a basic variable in the new
canonical form. This new canonical system yields a new basic solution (which may or
may not be feasible) similar to that of Eqs. (3.20). It is to be noted that the values of
all the basic variables change, in general, as we go from one basic solution to another,
but only one zero variable (which is nonbasic in the original canonical form) becomes
nonzero (which is basic in the new canonical system), and vice versa.

Example 3.3 Find all the basic solutions corresponding to the system of equations 


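$$\begin{aligned}
2x_1 + 3x_2 - 2x_3 - 7x_4 &= 1 &\qquad &(\mathrm{I}_0) \\
x_1 + x_2 + x_3 + 3x_4 &= 6 &\qquad &(\mathrm{II}_0) \\
x_1 - x_2 + x_3 + 5x_4 &= 4 &\qquad &(\mathrm{III}_0)
\end{aligned}$$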


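SOLUTION First we reduce the system of equations into a canonical form with $x_1$,
$x_2$, and $x_3$ as basic variables. For this, first we pivot on the element $a_{11} = 2$ to obtain

$$\begin{aligned}
x_1 + \tfrac{3}{2}x_2 - x_3 - \tfrac{7}{2}x_4 &= \tfrac{1}{2} &\qquad \mathrm{I}_1 &= \tfrac{1}{2}\mathrm{I}_0 \\
0 - \tfrac{1}{2}x_2 + 2x_3 + \tfrac{13}{2}x_4 &= \tfrac{11}{2} &\qquad \mathrm{II}_1 &= \mathrm{II}_0 - \mathrm{I}_1 \\
0 - \tfrac{5}{2}x_2 + 2x_3 + \tfrac{17}{2}x_4 &= \tfrac{7}{2} &\qquad \mathrm{III}_1 &= \mathrm{III}_0 - \mathrm{I}_1
\end{aligned}$$

Next we pivot on $a'_{22} = -\tfrac{1}{2}$ to obtain

$$\begin{aligned}
x_1 + 0 + 5x_3 + 16x_4 &= 17 &\qquad \mathrm{I}_2 &= \mathrm{I}_1 - \tfrac{3}{2}\mathrm{II}_2 \\
0 + x_2 - 4x_3 - 13x_4 &= -11 &\qquad \mathrm{II}_2 &= -2\,\mathrm{II}_1 \\
0 + 0 - 8x_3 - 24x_4 &= -24 &\qquad \mathrm{III}_2 &= \mathrm{III}_1 + \tfrac{5}{2}\mathrm{II}_2
\end{aligned}$$

Finally we pivot on $a'_{33} = -8$ to obtain the required canonical form as

$$\begin{aligned}
x_1 + x_4 &= 2 &\qquad \mathrm{I}_3 &= \mathrm{I}_2 - 5\,\mathrm{III}_3 \\
x_2 - x_4 &= 1 &\qquad \mathrm{II}_3 &= \mathrm{II}_2 + 4\,\mathrm{III}_3 \\
x_3 + 3x_4 &= 3 &\qquad \mathrm{III}_3 &= -\tfrac{1}{8}\mathrm{III}_2
\end{aligned}$$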

From this canonical form, we can readily write the solution of $x_1$, $x_2$, and $x_3$ in terms
of the other variable $x_4$ as

$$\begin{aligned}
x_1 &= 2 - x_4 \\
x_2 &= 1 + x_4 \\
x_3 &= 3 - 3x_4
\end{aligned}$$

If Eqs. ($\mathrm{I}_0$), ($\mathrm{II}_0$), and ($\mathrm{III}_0$) are the constraints of a linear programming problem, the
solution obtained by setting the independent variable equal to zero is called a basic
solution. In the present case, the basic solution is given by

$x_1 = 2$, $x_2 = 1$, $x_3 = 3$ (basic variables)

and $x_4 = 0$ (nonbasic or independent variable). Since this basic solution has all $x_j \ge 0$
($j = 1, 2, 3, 4$), it is a basic feasible solution.

If we want to move to a neighboring basic solution, we can proceed from the
canonical form given by Eqs. ($\mathrm{I}_3$), ($\mathrm{II}_3$), and ($\mathrm{III}_3$). Thus if a canonical form in terms
of the variables $x_1$, $x_2$, and $x_4$ is required, we have to bring $x_4$ into the basis in place
of the original basic variable $x_3$. Hence we pivot on $a''_{34}$ in Eq. ($\mathrm{III}_3$). This gives the
desired canonical form as

$$\begin{aligned}
x_1 - \tfrac{1}{3}x_3 &= 1 &\qquad \mathrm{I}_4 &= \mathrm{I}_3 - \mathrm{III}_4 \\
x_2 + \tfrac{1}{3}x_3 &= 2 &\qquad \mathrm{II}_4 &= \mathrm{II}_3 + \mathrm{III}_4 \\
x_4 + \tfrac{1}{3}x_3 &= 1 &\qquad \mathrm{III}_4 &= \tfrac{1}{3}\mathrm{III}_3
\end{aligned}$$

This canonical system gives the solution of $x_1$, $x_2$, and $x_4$ in terms of $x_3$ as

$$\begin{aligned}
x_1 &= 1 + \tfrac{1}{3}x_3 \\
x_2 &= 2 - \tfrac{1}{3}x_3 \\
x_4 &= 1 - \tfrac{1}{3}x_3
\end{aligned}$$

and the corresponding basic solution is given by

$x_1 = 1$, $x_2 = 2$, $x_4 = 1$ (basic variables)
$x_3 = 0$ (nonbasic variable)

This basic solution can also be seen to be a basic feasible solution. If we want to move
to the next basic solution with $x_1$, $x_3$, and $x_4$ as basic variables, we have to bring $x_3$
into the current basis in place of $x_2$. Thus we have to pivot on $a''_{23}$ in Eq. ($\mathrm{II}_4$). This leads
to the following canonical system:

$$\begin{aligned}
x_1 + x_2 &= 3 &\qquad \mathrm{I}_5 &= \mathrm{I}_4 + \tfrac{1}{3}\mathrm{II}_5 \\
x_3 + 3x_2 &= 6 &\qquad \mathrm{II}_5 &= 3\,\mathrm{II}_4 \\
x_4 - x_2 &= -1 &\qquad \mathrm{III}_5 &= \mathrm{III}_4 - \tfrac{1}{3}\mathrm{II}_5
\end{aligned}$$

The solution for $x_1$, $x_3$, and $x_4$ is given by

$$\begin{aligned}
x_1 &= 3 - x_2 \\
x_3 &= 6 - 3x_2 \\
x_4 &= -1 + x_2
\end{aligned}$$

from which the basic solution can be obtained as

$x_1 = 3$, $x_3 = 6$, $x_4 = -1$ (basic variables)
$x_2 = 0$ (nonbasic variable)

Since not all the $x_j$ are nonnegative, this basic solution is not feasible.

Finally, to obtain the canonical form in terms of the basic variables $x_2$, $x_3$, and $x_4$,
we pivot on $a''_{12}$ in Eq. ($\mathrm{I}_5$), thereby bringing $x_2$ into the current basis in place of $x_1$.
This gives

$$\begin{aligned}
x_2 + x_1 &= 3 &\qquad \mathrm{I}_6 &= \mathrm{I}_5 \\
x_3 - 3x_1 &= -3 &\qquad \mathrm{II}_6 &= \mathrm{II}_5 - 3\,\mathrm{I}_6 \\
x_4 + x_1 &= 2 &\qquad \mathrm{III}_6 &= \mathrm{III}_5 + \mathrm{I}_6
\end{aligned}$$

This canonical form gives the solution for $x_2$, $x_3$, and $x_4$ in terms of $x_1$ as

$$\begin{aligned}
x_2 &= 3 - x_1 \\
x_3 &= -3 + 3x_1 \\
x_4 &= 2 - x_1
\end{aligned}$$

and the corresponding basic solution is

$x_2 = 3$, $x_3 = -3$, $x_4 = 2$ (basic variables)
$x_1 = 0$ (nonbasic variable)

This basic solution can also be seen to be infeasible due to the negative value for $x_3$.
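
The four basic solutions of Example 3.3 can be cross-checked by brute force: choose every set of three columns as a candidate basis, set the remaining variable to zero, and solve the resulting square system. The sketch below does this with exact fractions. The function name `basic_solutions` is hypothetical, and the direct enumeration is a substitute for the successive pivots used in the text, chosen here only because it is short to write.

```python
from fractions import Fraction
from itertools import combinations


def basic_solutions(a, b):
    """Yield (basis, x) for every choice of m columns that forms a
    nonsingular basis; the nonbasic variables are set to zero."""
    m, n = len(a), len(a[0])
    for basis in combinations(range(n), m):
        # Augmented matrix restricted to the chosen basic columns.
        rows = [[Fraction(a[k][c]) for c in basis] + [Fraction(b[k])]
                for k in range(m)]
        singular = False
        for i in range(m):
            j = next((k for k in range(i, m) if rows[k][i] != 0), None)
            if j is None:
                singular = True
                break
            rows[i], rows[j] = rows[j], rows[i]
            p = rows[i][i]
            rows[i] = [v / p for v in rows[i]]
            for k in range(m):
                if k != i:
                    factor = rows[k][i]
                    rows[k] = [v - factor * w for v, w in zip(rows[k], rows[i])]
        if singular:
            continue  # these columns do not form a basis
        x = [Fraction(0)] * n
        for pos, c in enumerate(basis):
            x[c] = rows[pos][-1]
        yield basis, x


# Coefficients and right-hand sides of Eqs. (I_0), (II_0), (III_0).
A = [[2, 3, -2, -7], [1, 1, 1, 3], [1, -1, 1, 5]]
B = [1, 6, 4]
results = dict(basic_solutions(A, B))
for basis, x in results.items():
    label = "feasible" if all(v >= 0 for v in x) else "infeasible"
    print([c + 1 for c in basis], [str(v) for v in x], label)
```

With exact arithmetic the feasibility test is a plain sign check; a floating-point version would need a tolerance.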


3.8 MOTIVATION OF THE SIMPLEX METHOD 

Given a system in canonical form corresponding to a basic solution, we have seen how
to move to a neighboring basic solution by a pivot operation. Thus one way to find the
optimal solution of the given linear programming problem is to generate all the basic
solutions and pick the one that is feasible and corresponds to the optimal value of the
objective function. This can be done because the optimal solution, if one exists, always
occurs at an extreme point or vertex of the feasible domain. If there are m equality
constraints in n variables with $n \ge m$, a basic solution can be obtained by setting any
$n - m$ of the variables equal to zero. The number of basic solutions to be inspected is
thus equal to the number of ways in which m variables can be selected from a set of
n variables, that is,

$$\binom{n}{m} = \frac{n!}{(n-m)!\,m!}$$

For example, if n = 10 and m = 5, we have 252 basic solutions, and if n = 20 and
m = 10, we have 184,756 basic solutions. Usually, we do not have to inspect all these
basic solutions since many of them will be infeasible. However, for large values of n
and m, this is still a very large number to inspect one by one. Hence what we really need
is a computational scheme that examines a sequence of basic feasible solutions, each
of which corresponds to a lower value of f until a minimum is reached. The simplex
method of Dantzig is a powerful scheme for obtaining a basic feasible solution; if the
solution is not optimal, the method provides for finding a neighboring basic feasible
solution that has a lower or equal value of f. The process is repeated until, in a finite
number of steps, an optimum is found.
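
The growth of the number of candidate basic solutions is easy to reproduce. The short sketch below uses `math.comb` from the Python standard library (available from Python 3.8); the helper name `count_basic_solutions` is hypothetical, and the third problem size is an extra illustration not taken from the text.

```python
from math import comb


def count_basic_solutions(n, m):
    """Upper bound on the number of basic solutions: ways of choosing
    m basic variables out of n."""
    return comb(n, m)


for n, m in [(10, 5), (20, 10), (40, 20)]:
    print(f"n = {n}, m = {m}: {count_basic_solutions(n, m)} candidate bases")
```

The count is an upper bound, since some choices of m columns may be linearly dependent and therefore not form a basis.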

The first step involved in the simplex method is to construct an auxiliary prob- 
lem by introducing certain variables known as artificial variables into the standard 
form of the linear programming problem. The primary aim of adding the artificial 
variables is to bring the resulting auxiliary problem into a canonical form from which 
its basic feasible solution can be obtained immediately. Starting from this canonical 
form, the optimal solution of the original linear programming problem is sought in 
two phases. The first phase is intended to find a basic feasible solution to the orig- 
inal linear programming problem. It consists of a sequence of pivot operations that 
produces a succession of different canonical forms from which the optimal solution 
of the auxiliary problem can be found. This also enables us to find a basic feasible 
solution, if one exists, of the original linear programming problem. The second phase 
is intended to find the optimal solution of the original linear programming problem. 
It consists of a second sequence of pivot operations that enables us to move from 
one basic feasible solution to the next of the original linear programming problem. 
In this process, the optimal solution of the problem, if one exists, will be identified. 
The sequence of different canonical forms that is necessary in both the phases of 
the simplex method is generated according to the simplex algorithm described in the 
next section. That is, the simplex algorithm forms the main subroutine of the simplex 
method. 


3.9 SIMPLEX ALGORITHM

The starting point of the simplex algorithm is always a set of equations, which includes
the objective function along with the equality constraints of the problem in canonical
form. Thus the objective of the simplex algorithm is to find the vector $X \ge 0$ that
minimizes the function f(X) and satisfies the equations:

$$\begin{aligned}
1x_1 + 0x_2 + \cdots + 0x_m + a''_{1,m+1} x_{m+1} + \cdots + a''_{1n} x_n &= b''_1 \\
0x_1 + 1x_2 + \cdots + 0x_m + a''_{2,m+1} x_{m+1} + \cdots + a''_{2n} x_n &= b''_2 \\
&\;\vdots \\
0x_1 + 0x_2 + \cdots + 1x_m + a''_{m,m+1} x_{m+1} + \cdots + a''_{mn} x_n &= b''_m \\
0x_1 + 0x_2 + \cdots + 0x_m - f + c''_{m+1} x_{m+1} + \cdots + c''_n x_n &= -f''_0
\end{aligned} \tag{3.21}$$

where $a''_{ij}$, $c''_j$, $b''_i$, and $f''_0$ are constants. Notice that (−f) is treated as a basic variable
in the canonical form of Eqs. (3.21). The basic solution that can readily be deduced
from Eqs. (3.21) is

$$\begin{aligned}
x_i &= b''_i, & i &= 1, 2, \ldots, m \\
f &= f''_0 \\
x_i &= 0, & i &= m+1, m+2, \ldots, n
\end{aligned} \tag{3.22}$$

If the basic solution is also feasible, the values of $x_i$, $i = 1, 2, \ldots, n$, are nonnegative
and hence

$$b''_i \ge 0, \qquad i = 1, 2, \ldots, m \tag{3.23}$$

In phase I of the simplex method, the basic solution corresponding to the canonical form 
obtained after the introduction of the artificial variables will be feasible for the auxiliary 
problem. As stated earlier, phase II of the simplex method starts with a basic feasible 
solution of the original linear programming problem. Hence the initial canonical form 
at the start of the simplex algorithm will always be a basic feasible solution. 

We know from Theorem 3.6 that the optimal solution of a linear programming 
problem lies at one of the basic feasible solutions. Since the simplex algorithm is 
intended to move from one basic feasible solution to the other through pivotal oper- 
ations, before moving to the next basic feasible solution, we have to make sure that 
the present basic feasible solution is not the optimal solution. By merely glancing at 
the numbers $c''_j$, $j = 1, 2, \ldots, n$, we can tell whether or not the present basic feasible
solution is optimal. Theorem 3.7 provides a means of identifying the optimal point. 


3.9.1 Identifying an Optimal Point 

Theorem 3.7 A basic feasible solution is an optimal solution with a minimum objec-
tive function value of $f''_0$ if all the cost coefficients $c''_j$, $j = m+1, m+2, \ldots, n$, in
Eqs. (3.21) are nonnegative.

Proof: From the last row of Eqs. (3.21), we can write that

$$f''_0 + \sum_{i=m+1}^{n} c''_i x_i = f \tag{3.24}$$

Since the variables $x_{m+1}, x_{m+2}, \ldots, x_n$ are presently zero and are constrained to be
nonnegative, the only way any one of them can change is to become positive. But if
$c''_i \ge 0$ for $i = m+1, m+2, \ldots, n$, then increasing any $x_i$ cannot decrease the value
of the objective function f. Since no change in the nonbasic variables can cause f to
decrease, the present solution must be optimal with the optimal value of f equal to $f''_0$.

A glance over $c''_i$ can also tell us if there are multiple optima. Let all $c''_i > 0$,
$i = m+1, m+2, \ldots, k-1, k+1, \ldots, n$, and let $c''_k = 0$ for some nonbasic variable
$x_k$. Then if the constraints allow that variable to be made positive (from its present
value of zero), no change in f results, and there are multiple optima. It is possible,
however, that the variable may not be allowed by the constraints to become positive;
this may occur in the case of degenerate solutions. Thus as a corollary to the discussion
above, we can state that a basic feasible solution is the unique optimal feasible solution
if $c''_j > 0$ for all nonbasic variables $x_j$, $j = m+1, m+2, \ldots, n$. If, after testing for
optimality, the current basic feasible solution is found to be nonoptimal, an improved
basic solution is obtained from the present canonical form as follows.
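
The optimality test of Theorem 3.7 and its corollary amounts to inspecting the signs of the cost coefficients of the nonbasic variables. A minimal sketch follows; the function name `classify` and the returned labels are hypothetical, and the sample coefficient lists are the last-row values that arise in Examples 3.4 and 3.6 of this chapter.

```python
from fractions import Fraction


def classify(nonbasic_costs):
    """Classify a basic feasible solution of a minimization problem from
    the cost coefficients c''_j of its nonbasic variables."""
    if any(c < 0 for c in nonbasic_costs):
        return "not optimal"
    if all(c > 0 for c in nonbasic_costs):
        return "unique optimum"
    # All coefficients are >= 0 and at least one is zero: the solution is
    # optimal, and other optima exist if that variable can be made positive.
    return "optimal, alternative optima possible"


print(classify([-1, -2, -1]))                          # start of Example 3.4
print(classify([6, Fraction(11, 4), Fraction(3, 4)]))  # end of Example 3.4
print(classify([0, 10]))                               # end of Example 3.6
```

The third label is deliberately hedged: as the text notes, a zero cost coefficient signals multiple optima only when the constraints allow the corresponding variable to become positive.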

3.9.2 Improving a Nonoptimal Basic Feasible Solution 

From the last row of Eqs. (3.21), we can write the objective function as

$$f = f''_0 + \sum_{i=1}^{m} c''_i x_i + \sum_{j=m+1}^{n} c''_j x_j = f''_0 \quad \text{for the solution given by Eqs. (3.22)} \tag{3.25}$$

If at least one $c''_j$ is negative, the value of f can be reduced by making the corresponding
$x_j > 0$. In other words, the nonbasic variable $x_j$, for which the cost coefficient $c''_j$ is
negative, is to be made a basic variable in order to reduce the value of the objective
function. At the same time, due to the pivotal operation, one of the current basic
variables will become nonbasic and hence the values of the new basic variables are to be
adjusted in order to bring the value of f below $f''_0$. If there is more than one $c''_j < 0$,
the index s of the nonbasic variable $x_s$ which is to be made basic is chosen such that

$$c''_s = \min_j \, (c''_j < 0) \tag{3.26}$$

Although this may not lead to the greatest possible decrease in f (since it may not be
possible to increase $x_s$ very far), this is intuitively at least a good rule for choosing the
variable to become basic. It is the one generally used in practice because it is simple
and it usually leads to fewer iterations than just choosing any $c''_j < 0$. If there is a
tie in applying Eq. (3.26) (i.e., if more than one $c''_j$ has the same minimum value),
we select one of them arbitrarily as $c''_s$.

Having decided on the variable $x_s$ to become basic, we increase it from zero,
holding all other nonbasic variables zero, and observe the effect on the current basic
variables. From Eqs. (3.21), we can obtain

$$\begin{aligned}
x_1 &= b''_1 - a''_{1s} x_s, & b''_1 &\ge 0 \\
x_2 &= b''_2 - a''_{2s} x_s, & b''_2 &\ge 0 \\
&\;\vdots \\
x_m &= b''_m - a''_{ms} x_s, & b''_m &\ge 0
\end{aligned} \tag{3.27}$$

$$f = f''_0 + c''_s x_s, \qquad c''_s < 0 \tag{3.28}$$

Since $c''_s < 0$, Eq. (3.28) suggests that the value of $x_s$ should be made as large as
possible in order to reduce the value of f as much as possible. However, in the process
of increasing the value of $x_s$, some of the variables $x_i$ ($i = 1, 2, \ldots, m$) in Eqs. (3.27)
may become negative. It can be seen that if all the coefficients $a''_{is} \le 0$, $i = 1, 2, \ldots, m$,
then $x_s$ can be made infinitely large without making any $x_i < 0$, $i = 1, 2, \ldots, m$. In
such a case, the minimum value of f is minus infinity and the linear programming
problem is said to have an unbounded solution.

On the other hand, if at least one $a''_{is}$ is positive, the maximum value that $x_s$ can
take without making $x_i$ negative is $b''_i / a''_{is}$. If there is more than one $a''_{is} > 0$, the
largest value $x^*_s$ that $x_s$ can take is given by the minimum of the ratios $b''_i / a''_{is}$ for
which $a''_{is} > 0$. Thus

$$x^*_s = \frac{b''_r}{a''_{rs}} = \min_{a''_{is} > 0} \left( \frac{b''_i}{a''_{is}} \right) \tag{3.29}$$

The choice of r in the case of a tie, assuming that all $b''_i > 0$, is arbitrary. If any $b''_i$
for which $a''_{is} > 0$ is zero in Eqs. (3.27), $x_s$ cannot be increased by any amount. Such
a solution is called a degenerate solution.

In the case of a nondegenerate basic feasible solution, a new basic feasible solu-
tion can be constructed with a lower value of the objective function as follows. By
substituting the value of $x^*_s$ given by Eq. (3.29) into Eqs. (3.27) and (3.28), we obtain

$$\begin{aligned}
x_s &= x^*_s \\
x_i &= b''_i - a''_{is} x^*_s \ge 0, & i &= 1, 2, \ldots, m \ \text{and} \ i \ne r \\
x_r &= 0 \\
x_j &= 0, & j &= m+1, m+2, \ldots, n \ \text{and} \ j \ne s
\end{aligned} \tag{3.30}$$

$$f = f''_0 + c''_s x^*_s \le f''_0 \tag{3.31}$$

which can readily be seen to be a feasible solution different from the previous one.
Since $a''_{rs} > 0$ in Eq. (3.29), a single pivot operation on the element $a''_{rs}$ in the system
of Eqs. (3.21) will lead to a new canonical form from which the basic feasible solution
of Eqs. (3.30) can easily be deduced. Also, Eq. (3.31) shows that this basic feasible
solution corresponds to a lower objective function value compared to that of Eqs. (3.22).
This basic feasible solution can again be tested for optimality by seeing whether all
$c''_i \ge 0$ in the new canonical form. If the solution is not optimal, the entire procedure
of moving to another basic feasible solution from the present one has to be repeated.
In the simplex algorithm, this procedure is repeated in an iterative manner until the
algorithm finds either (1) a class of feasible solutions for which $f \to -\infty$ or (2) an
optimal basic feasible solution with all $c''_i \ge 0$, $i = 1, 2, \ldots, n$. Since there are only
a finite number of ways to choose a set of m basic variables out of n variables, the
iterative process of the simplex algorithm will terminate in a finite number of cycles.
The iterative process of the simplex algorithm is shown as a flowchart in Fig. 3.14.
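
The iteration just described (choose the entering variable by Eq. (3.26), the leaving row by the ratio test of Eq. (3.29), then pivot) can be sketched as follows. This is a minimal illustration under stated assumptions rather than a robust solver: the function name `simplex` and its argument layout are hypothetical, the −f column is omitted because it never changes, ties are broken by taking the first candidate, and no anti-cycling safeguard is included, so termination is not guaranteed for every degenerate problem.

```python
from fractions import Fraction


def simplex(tableau, basis):
    """Minimize f starting from a canonical tableau.

    tableau: m constraint rows followed by one cost row. Each row is
             [coefficients of x_1..x_n, right-hand side]; the cost row holds
             the c''_j and, in the last position, -f''_0.
    basis:   column indices (0-based) of the m current basic variables.
    Returns (status, x, f) with status 'optimal' or 'unbounded'.
    """
    t = [[Fraction(v) for v in row] for row in tableau]
    basis = list(basis)
    m, n = len(t) - 1, len(t[0]) - 1
    while True:
        # Entering variable: most negative cost coefficient, Eq. (3.26).
        s = min(range(n), key=lambda j: t[m][j])
        if t[m][s] >= 0:
            status = "optimal"
            break
        candidates = [i for i in range(m) if t[i][s] > 0]
        if not candidates:
            status = "unbounded"  # x_s can grow without limit
            break
        # Leaving row: minimum ratio b''_i / a''_is, Eq. (3.29).
        r = min(candidates, key=lambda i: t[i][n] / t[i][s])
        p = t[r][s]
        t[r] = [v / p for v in t[r]]
        for i in range(m + 1):
            if i != r:
                factor = t[i][s]
                t[i] = [v - factor * w for v, w in zip(t[i], t[r])]
        basis[r] = s
    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        x[j] = t[i][n]
    return status, x, -t[m][n]


# Data of Example 3.4 after adding slack variables x4, x5, x6.
example_34 = [
    [2, 1, -1, 1, 0, 0, 2],
    [2, -1, 5, 0, 1, 0, 6],
    [4, 1, 1, 0, 0, 1, 6],
    [-1, -2, -1, 0, 0, 0, 0],
]
status, x, f = simplex(example_34, basis=[3, 4, 5])
print(status, [str(v) for v in x], f)
```

When the returned status is 'unbounded', the returned point is simply the last basic feasible solution visited, not a minimizer.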



Figure 3.14 Flowchart for finding the optimal solution by the simplex algorithm.

Example 3.4

Maximize $F = x_1 + 2x_2 + x_3$

subject to

$$\begin{aligned}
2x_1 + x_2 - x_3 &\le 2 \\
-2x_1 + x_2 - 5x_3 &\ge -6 \\
4x_1 + x_2 + x_3 &\le 6 \\
x_i &\ge 0, \quad i = 1, 2, 3
\end{aligned}$$

SOLUTION We first change the sign of the objective function to convert it to a
minimization problem and the signs of the inequalities (where necessary) so as to
obtain nonnegative values of $b_i$ (to see whether an initial basic feasible solution can
be obtained readily). The resulting problem can be stated as

Minimize $f = -x_1 - 2x_2 - x_3$

subject to

$$\begin{aligned}
2x_1 + x_2 - x_3 &\le 2 \\
2x_1 - x_2 + 5x_3 &\le 6 \\
4x_1 + x_2 + x_3 &\le 6 \\
x_i &\ge 0, \quad i = 1 \text{ to } 3
\end{aligned}$$

By introducing the slack variables $x_4 \ge 0$, $x_5 \ge 0$, and $x_6 \ge 0$, the system of equations
can be stated in canonical form as

$$\begin{aligned}
2x_1 + x_2 - x_3 + x_4 &= 2 \\
2x_1 - x_2 + 5x_3 + x_5 &= 6 \\
4x_1 + x_2 + x_3 + x_6 &= 6 \\
-x_1 - 2x_2 - x_3 - f &= 0
\end{aligned} \tag{E1}$$

where $x_4$, $x_5$, $x_6$, and −f can be treated as basic variables. The basic solution corre-
sponding to Eqs. (E1) is given by

$x_4 = 2$, $x_5 = 6$, $x_6 = 6$ (basic variables)
$x_1 = x_2 = x_3 = 0$ (nonbasic variables) (E2)
$f = 0$

which can be seen to be feasible.

Since the cost coefficients corresponding to nonbasic variables in Eqs. (E1) are
negative ($c''_1 = -1$, $c''_2 = -2$, $c''_3 = -1$), the present solution given by Eqs. (E2) is not
optimum. To improve the present basic feasible solution, we first decide the variable
($x_s$) to be brought into the basis as

$$c''_s = \min(c''_i < 0) = c''_2 = -2$$

Thus $x_2$ enters the next basic set. To obtain the new canonical form, we select the pivot
element $a''_{rs}$ such that

$$\frac{b''_r}{a''_{rs}} = \min_{a''_{is} > 0} \left( \frac{b''_i}{a''_{is}} \right)$$

In the present case, s = 2 and $a''_{12}$ and $a''_{32}$ are > 0. Since $b''_1 / a''_{12} = 2/1$ and
$b''_3 / a''_{32} = 6/1$, r = 1, and the basic variable of the first row, $x_4$, leaves the basis. By
pivoting on $a''_{12}$, the new system of equations can be obtained as

$$\begin{aligned}
2x_1 + 1x_2 - x_3 + x_4 &= 2 \\
4x_1 + 0x_2 + 4x_3 + x_4 + x_5 &= 8 \\
2x_1 + 0x_2 + 2x_3 - x_4 + x_6 &= 4 \\
3x_1 + 0x_2 - 3x_3 + 2x_4 - f &= 4
\end{aligned} \tag{E3}$$

The basic feasible solution corresponding to this canonical form is

$x_2 = 2$, $x_5 = 8$, $x_6 = 4$ (basic variables)
$x_1 = x_3 = x_4 = 0$ (nonbasic variables) (E4)
$f = -4$

Since $c''_3 = -3$, the present solution is not optimum. As $c''_s = \min(c''_i < 0) = c''_3$, $x_s = x_3$
enters the next basis.

To find the pivot element $a''_{rs}$, we find the ratios $b''_i / a''_{is}$ for $a''_{is} > 0$. In Eqs. (E3),
only $a''_{23}$ and $a''_{33}$ are > 0, and hence

$$\frac{b''_2}{a''_{23}} = \frac{8}{4} \quad \text{and} \quad \frac{b''_3}{a''_{33}} = \frac{4}{2}$$

Since both these ratios are the same, we arbitrarily select $a''_{23}$ as the pivot element. Pivoting
on $a''_{23}$ gives the following canonical system of equations:

$$\begin{aligned}
3x_1 + 1x_2 + 0x_3 + \tfrac{5}{4}x_4 + \tfrac{1}{4}x_5 &= 4 \\
1x_1 + 0x_2 + 1x_3 + \tfrac{1}{4}x_4 + \tfrac{1}{4}x_5 &= 2 \\
0x_1 + 0x_2 + 0x_3 - \tfrac{3}{2}x_4 - \tfrac{1}{2}x_5 + x_6 &= 0 \\
6x_1 + 0x_2 + 0x_3 + \tfrac{11}{4}x_4 + \tfrac{3}{4}x_5 - f &= 10
\end{aligned} \tag{E5}$$

The basic feasible solution corresponding to this canonical system is given by

$x_2 = 4$, $x_3 = 2$, $x_6 = 0$ (basic variables)
$x_1 = x_4 = x_5 = 0$ (nonbasic variables) (E6)
$f = -10$
Since all $c''_i$ are ≥ 0 in the present canonical form, the solution given in (E6) will be
optimum. Usually, starting with Eqs. (E1), all the computations are done in a tableau
form as shown below (the pivot element is enclosed in brackets):

Basic                     Variables                            b''_i/a''_is
variables   x1    x2    x3    x4    x5    x6    -f    b''_i    for a''_is > 0
x4           2    [1]   -1     1     0     0     0      2      2  <- Smaller one (x4 drops from next basis)
x5           2    -1     5     0     1     0     0      6
x6           4     1     1     0     0     1     0      6      6
-f          -1    -2    -1     0     0     0     1      0
Most negative c''_i is in the x2 column (x2 enters next basis).

Result of pivoting:

x2           2     1    -1     1     0     0     0      2
x5           4     0    [4]    1     1     0     0      8      2  (Select this arbitrarily. x5 drops from next basis)
x6           2     0     2    -1     0     1     0      4      2
-f           3     0    -3     2     0     0     1      4
Most negative c''_i is in the x3 column (x3 enters the next basis).

Result of pivoting:

x2           3     1     0    5/4   1/4    0     0      4
x3           1     0     1    1/4   1/4    0     0      2
x6           0     0     0   -3/2  -1/2    1     0      0
-f           6     0     0   11/4   3/4    0     1     10

All $c''_i$ are ≥ 0 and hence the present solution is optimum.


Example 3.5 Unbounded Solution

Minimize $f = -3x_1 - 2x_2$

subject to

$$\begin{aligned}
x_1 - x_2 &\le 1 \\
3x_1 - 2x_2 &\le 6 \\
x_1 \ge 0, \quad x_2 &\ge 0
\end{aligned}$$

SOLUTION Introducing the slack variables $x_3 \ge 0$ and $x_4 \ge 0$, the given system of
equations can be written in canonical form as

$$\begin{aligned}
x_1 - x_2 + x_3 &= 1 \\
3x_1 - 2x_2 + x_4 &= 6 \\
-3x_1 - 2x_2 - f &= 0
\end{aligned} \tag{E1}$$

The basic feasible solution corresponding to this canonical form is given by

$x_3 = 1$, $x_4 = 6$ (basic variables)
$x_1 = x_2 = 0$ (nonbasic variables) (E2)
$f = 0$

Since the cost coefficients corresponding to the nonbasic variables are negative, the
solution given by Eqs. (E2) is not optimum. Hence the simplex procedure is applied to
the canonical system of Eqs. (E1) starting from the solution, Eqs. (E2). The computa-
tions are done in tableau form as shown below (the pivot element is enclosed in brackets):


Basic            Variables                        b''_i/a''_is
variables   x1    x2    x3    x4    -f    b''_i   for a''_is > 0
x3          [1]   -1     1     0     0      1     1  <- Smaller value (x3 leaves the basis)
x4           3    -2     0     1     0      6     2
-f          -3    -2     0     0     1      0
Most negative c''_i is in the x1 column (x1 enters the next basis).

Result of pivoting:

x1           1    -1     1     0     0      1
x4           0    [1]   -3     1     0      3     3  (x4 leaves the basis)
-f           0    -5     3     0     1      3
Most negative c''_i is in the x2 column (x2 enters the next basis).

Result of pivoting:

x1           1     0    -2     1     0      4     Both a''_is are negative (i.e.,
x2           0     1    -3     1     0      3     no variable leaves the basis)
-f           0     0   -12     5     1     18
Most negative c''_i is in the x3 column (x3 enters the basis).

At this stage we notice that $x_3$ has the most negative cost coefficient and hence
it should be brought into the next basis. However, since all the coefficients $a''_{i3}$ are
negative, the value of f can be decreased indefinitely without violating any of the
constraints if we bring $x_3$ into the basis. Hence the problem has no bounded solution.
In general, if all the coefficients of the entering variable $x_s$ ($a''_{is}$) have negative or
zero values at any iteration, we can conclude that the problem has an unbounded
solution.
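
The unboundedness in Example 3.5 can be made concrete by reading the last tableau as $x_1 = 4 + 2x_3 - x_4$ and $x_2 = 3 + 3x_3 - x_4$ and then letting $x_3$ grow while $x_4$ stays at zero. The sketch below checks that every such point satisfies the constraints while f keeps decreasing. The function name `point_on_ray` and the sample values of the parameter t are hypothetical choices made for illustration.

```python
def point_on_ray(t):
    """Feasible point obtained from the last tableau of Example 3.5 by
    setting the nonbasic variables to x3 = t and x4 = 0."""
    x3, x4 = t, 0
    x1 = 4 + 2 * x3 - x4
    x2 = 3 + 3 * x3 - x4
    return x1, x2, x3, x4


def objective(x1, x2):
    return -3 * x1 - 2 * x2


for t in (0, 1, 10, 1000):
    x1, x2, x3, x4 = point_on_ray(t)
    # Both equality constraints of Eqs. (E1) and nonnegativity hold for all t >= 0.
    assert x1 - x2 + x3 == 1
    assert 3 * x1 - 2 * x2 + x4 == 6
    assert min(x1, x2, x3, x4) >= 0
    print(f"t = {t}: f = {objective(x1, x2)}")
```

Along this ray the objective equals −18 − 12t, so no finite minimum exists.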


Example 3.6 Infinite Number of Solutions To demonstrate how a problem having an
infinite number of solutions can be solved, Example 3.2 is again considered with a
modified objective function:

Minimize $f = -40x_1 - 100x_2$

subject to

$$\begin{aligned}
10x_1 + 5x_2 &\le 2500 \\
4x_1 + 10x_2 &\le 2000 \\
2x_1 + 3x_2 &\le 900 \\
x_1 \ge 0, \quad x_2 &\ge 0
\end{aligned}$$

SOLUTION By adding the slack variables $x_3 \ge 0$, $x_4 \ge 0$, and $x_5 \ge 0$, the equations
can be written in canonical form as follows:

$$\begin{aligned}
10x_1 + 5x_2 + x_3 &= 2500 \\
4x_1 + 10x_2 + x_4 &= 2000 \\
2x_1 + 3x_2 + x_5 &= 900 \\
-40x_1 - 100x_2 - f &= 0
\end{aligned}$$

The computations can be done in tableau form as shown below (the pivot element is enclosed in brackets):


Basic                  Variables                              b''_i/a''_is
variables   x1      x2     x3     x4     x5    -f    b''_i    for a''_is > 0
x3          10       5      1      0      0     0     2,500   500
x4           4     [10]     0      1      0     0     2,000   200  <- Smaller value (x4 leaves the basis)
x5           2       3      0      0      1     0       900   300
-f         -40    -100      0      0      0     1         0
Most negative c''_i is in the x2 column (x2 enters the basis).

Result of pivoting:

x3           8       0      1    -1/2     0     0     1,500
x2         4/10      1      0    1/10     0     0       200
x5         8/10      0      0   -3/10     1     0       300
-f           0       0      0     10      0     1    20,000

Since all $c''_i \ge 0$, the present solution is optimum. The optimum values are
given by

$x_2 = 200$, $x_3 = 1500$, $x_5 = 300$ (basic variables)
$x_1 = x_4 = 0$ (nonbasic variables)
$f_{\min} = -20{,}000$

Important note: It can be observed from the last row of the preceding tableau that
the cost coefficient corresponding to the nonbasic variable $x_1$ ($c''_1$) is zero. This is an
indication that an alternative solution exists. Here $x_1$ can be brought into the basis and
the resulting new solution will also be an optimal basic feasible solution. For example,
introducing $x_1$ into the basis in place of $x_3$ (i.e., by pivoting on the element 8 in the
$x_3$ row and $x_1$ column), we obtain the new canonical system of equations as shown in
the following tableau:


Basic                  Variables                              b''_i/a''_is
variables   x1      x2     x3     x4     x5    -f    b''_i    for a''_is > 0
x1           1       0     1/8   -1/16    0     0    1500/8
x2           0       1    -1/20   1/8     0     0      125
x5           0       0    -1/10  -1/4     1     0      150
-f           0       0      0     10      0     1    20,000

The solution corresponding to this canonical form is given by

$x_1 = \tfrac{1500}{8}$, $x_2 = 125$, $x_5 = 150$ (basic variables)
$x_3 = x_4 = 0$ (nonbasic variables)
$f_{\min} = -20{,}000$

Thus the value of f has not changed compared to the preceding value since $x_1$ has a
zero cost coefficient in the last row of the preceding tableau. Once two basic (optimal)
feasible solutions, namely,

$$X_1 = \begin{Bmatrix} 0 \\ 200 \\ 1500 \\ 0 \\ 300 \end{Bmatrix} \quad \text{and} \quad X_2 = \begin{Bmatrix} \tfrac{1500}{8} \\ 125 \\ 0 \\ 0 \\ 150 \end{Bmatrix}$$

are known, an infinite number of nonbasic (optimal) feasible solutions can be obtained
by taking any weighted average of the two solutions as

$$X^* = \lambda X_1 + (1 - \lambda) X_2$$

$$X^* = \begin{Bmatrix} x^*_1 \\ x^*_2 \\ x^*_3 \\ x^*_4 \\ x^*_5 \end{Bmatrix} = \begin{Bmatrix} (1 - \lambda)\tfrac{1500}{8} \\ 200\lambda + (1 - \lambda)125 \\ 1500\lambda \\ 0 \\ 300\lambda + (1 - \lambda)150 \end{Bmatrix} = \begin{Bmatrix} (1 - \lambda)\tfrac{1500}{8} \\ 125 + 75\lambda \\ 1500\lambda \\ 0 \\ 150 + 150\lambda \end{Bmatrix}, \qquad 0 \le \lambda \le 1$$

It can be verified that the solution X* will always give the same value of −20,000 for
f for all $0 \le \lambda \le 1$.
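
The verification mentioned above is straightforward to carry out numerically. The following sketch blends the two optimal basic solutions of Example 3.6 for several values of λ and checks both feasibility and the objective value. The helper names `blend`, `objective`, and `is_feasible` are hypothetical, and exact fractions are assumed so that equality tests are meaningful.

```python
from fractions import Fraction

# The two optimal basic feasible solutions (x1, ..., x5) of Example 3.6.
X1 = [Fraction(0), Fraction(200), Fraction(1500), Fraction(0), Fraction(300)]
X2 = [Fraction(1500, 8), Fraction(125), Fraction(0), Fraction(0), Fraction(150)]


def blend(lam):
    """Weighted average lam*X1 + (1 - lam)*X2."""
    return [lam * a + (1 - lam) * b for a, b in zip(X1, X2)]


def objective(x):
    return -40 * x[0] - 100 * x[1]


def is_feasible(x):
    """Check the three equality constraints (with slacks) and nonnegativity."""
    x1, x2, x3, x4, x5 = x
    return (10 * x1 + 5 * x2 + x3 == 2500
            and 4 * x1 + 10 * x2 + x4 == 2000
            and 2 * x1 + 3 * x2 + x5 == 900
            and all(v >= 0 for v in x))


for k in range(5):
    lam = Fraction(k, 4)
    x = blend(lam)
    print(f"lambda = {lam}: feasible = {is_feasible(x)}, f = {objective(x)}")
```

Every blended point lies on the edge joining the two vertices, which is why the objective stays constant along it.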


3.10 TWO PHASES OF THE SIMPLEX METHOD 

The problem is to find nonnegative values for the variables $x_1, x_2, \ldots, x_n$ that satisfy
the equations

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned} \tag{3.32}$$

and minimize the objective function given by

$$c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = f \tag{3.33}$$

The general problems encountered in solving this problem are 

1. An initial feasible canonical form may not be readily available. This is the case 
when the linear programming problem does not have slack variables for some 
of the equations or when the slack variables have negative coefficients. 

2. The problem may have redundancies and/or inconsistencies, and may not be 
solvable in nonnegative numbers. 

The two-phase simplex method can be used to solve the problem. 

Phase I of the simplex method uses the simplex algorithm itself to find whether 
the linear programming problem has a feasible solution. If a feasible solution exists, 
it provides a basic feasible solution in canonical form ready to initiate phase II of the 
method. Phase II, in turn, uses the simplex algorithm to find whether the problem has 
a bounded optimum. If a bounded optimum exists, it finds the basic feasible solution 
that is optimal. The simplex method is described in the following steps.

1. Arrange the original system of Eqs. (3.32) so that all constant terms $b_i$ are
positive or zero by changing, where necessary, the signs on both sides of any
of the equations.

2. Introduce to this system a set of artificial variables $y_1, y_2, \ldots, y_m$ (which serve
as basic variables in phase I), where each $y_i \ge 0$, so that it becomes

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n + y_1 &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n + y_2 &= b_2 \\
&\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n + y_m &= b_m \\
b_i &\ge 0
\end{aligned} \tag{3.34}$$

Note that in Eqs. (3.34), for a particular i, the $a_{ij}$'s and the $b_i$ may be the
negative of what they were in Eq. (3.32) because of step 1.

The objective function of Eq. (3.33) can be written as

$$c_1 x_1 + c_2 x_2 + \cdots + c_n x_n + (-f) = 0 \tag{3.35}$$

3. Phase I of the method. Define a quantity w as the sum of the artificial variables

$$w = y_1 + y_2 + \cdots + y_m \tag{3.36}$$

and use the simplex algorithm to find $x_i \ge 0$ ($i = 1, 2, \ldots, n$) and $y_i \ge 0$ ($i =
1, 2, \ldots, m$) which minimize w and satisfy Eqs. (3.34) and (3.35). Consequently,
consider the array

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n + y_1 &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n + y_2 &= b_2 \\
&\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n + y_m &= b_m \\
c_1 x_1 + c_2 x_2 + \cdots + c_n x_n + (-f) &= 0 \\
y_1 + y_2 + \cdots + y_m + (-w) &= 0
\end{aligned} \tag{3.37}$$

This array is not in canonical form; however, it can be rewritten as a canonical
system with basic variables $y_1, y_2, \ldots, y_m$, −f, and −w by subtracting the sum
of the first m equations from the last to obtain the new system

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n + y_1 &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n + y_2 &= b_2 \\
&\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n + y_m &= b_m \\
c_1 x_1 + c_2 x_2 + \cdots + c_n x_n + (-f) &= 0 \\
d_1 x_1 + d_2 x_2 + \cdots + d_n x_n + (-w) &= -w_0
\end{aligned} \tag{3.38}$$

where

$$d_i = -(a_{1i} + a_{2i} + \cdots + a_{mi}), \qquad i = 1, 2, \ldots, n \tag{3.39}$$

$$-w_0 = -(b_1 + b_2 + \cdots + b_m) \tag{3.40}$$

Equations (3.38) provide the initial basic feasible solution that is necessary for
starting phase I.

4. In Eq. (3.37), the expression of w in terms of the artificial variables
$y_1, y_2, \ldots, y_m$ is known as the infeasibility form. It has the property that if
phase I ends with a minimum of w > 0, no feasible solution exists
for the original linear programming problem stated in Eqs. (3.32) and (3.33),
and thus the procedure is terminated. On the other hand, if the minimum of
w = 0, the resulting array will be in canonical form, and phase II is initiated
by eliminating the w equation as well as the columns corresponding to each
of the artificial variables $y_1, y_2, \ldots, y_m$ from the array.

5. Phase II of the method. Apply the simplex algorithm to the adjusted canonical
system at the end of phase I to obtain a solution, if a finite one exists, which
optimizes the value of f.

The flowchart for the two-phase simplex method is given in Fig. 3.15. 
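
Steps 1 to 3 above are mostly bookkeeping: make every right-hand side nonnegative, then form the coefficients $d_i$ and the constant $-w_0$ of the −w row from Eqs. (3.39) and (3.40). A minimal sketch of that setup follows. The function name `phase_one_setup` is hypothetical, and the sample data are the constraint coefficients of Example 3.7, which appears next in this section.

```python
def phase_one_setup(a, b):
    """Prepare the phase I data for the constraints A x = b.

    Returns (rows, rhs, d, minus_w0), where rows and rhs are the constraints
    with nonnegative right-hand sides (step 1), d holds the coefficients
    d_i of Eq. (3.39), and minus_w0 is the constant of Eq. (3.40)."""
    rows = [list(r) for r in a]
    rhs = list(b)
    for i in range(len(rhs)):
        if rhs[i] < 0:
            # Step 1: change the signs on both sides of the equation.
            rows[i] = [-v for v in rows[i]]
            rhs[i] = -rhs[i]
    n = len(rows[0])
    # Eq. (3.39): d_i is minus the column sum of the constraint coefficients.
    d = [-sum(row[j] for row in rows) for j in range(n)]
    # Eq. (3.40): -w_0 is minus the sum of the right-hand sides.
    return rows, rhs, d, -sum(rhs)


A = [[3, -3, 4, 2, -1],
     [1, 1, 1, 3, 1]]
B = [0, 2]
rows, rhs, d, minus_w0 = phase_one_setup(A, B)
print("d =", d, " -w0 =", minus_w0)
```

The artificial variables themselves need no explicit storage at this stage, since each one simply contributes a unit column to its own equation.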


Figure 3.15 Flowchart for the two-phase simplex method.

Example 3.7

Minimize $f = 2x_1 + 3x_2 + 2x_3 - x_4 + x_5$

subject to the constraints

$$\begin{aligned}
3x_1 - 3x_2 + 4x_3 + 2x_4 - x_5 &= 0 \\
x_1 + x_2 + x_3 + 3x_4 + x_5 &= 2 \\
x_i &\ge 0, \quad i = 1 \text{ to } 5
\end{aligned}$$

SOLUTION

Step 1 As the constants on the right-hand side of the constraints are already nonneg-
ative, the application of step 1 is unnecessary.

Step 2 Introducing the artificial variables $y_1 \ge 0$ and $y_2 \ge 0$, the equations can be
written as follows:

$$\begin{aligned}
3x_1 - 3x_2 + 4x_3 + 2x_4 - x_5 + y_1 &= 0 \\
x_1 + x_2 + x_3 + 3x_4 + x_5 + y_2 &= 2 \\
2x_1 + 3x_2 + 2x_3 - x_4 + x_5 - f &= 0
\end{aligned} \tag{E1}$$

Step 3 By defining the infeasibility form w as

$$w = y_1 + y_2$$

the complete array of equations can be written as

$$\begin{aligned}
3x_1 - 3x_2 + 4x_3 + 2x_4 - x_5 + y_1 &= 0 \\
x_1 + x_2 + x_3 + 3x_4 + x_5 + y_2 &= 2 \\
2x_1 + 3x_2 + 2x_3 - x_4 + x_5 - f &= 0 \\
y_1 + y_2 - w &= 0
\end{aligned} \tag{E2}$$

This array can be rewritten as a canonical system with basic variables $y_1$,
$y_2$, −f, and −w by subtracting the sum of the first two equations of (E2) from
the last equation of (E2). Thus the last equation of (E2) becomes

$$-4x_1 + 2x_2 - 5x_3 - 5x_4 + 0x_5 - w = -2 \tag{E3}$$

Since this canonical system [first three equations of (E2), and (E3)] provides
an initial basic feasible solution, phase I of the simplex method can be started.
The phase I computations are shown below in tableau form.









Basic             Admissible variables             Artificial variables            Value of b_i''/a_is''
variables   x_1     x_2     x_3     x_4     x_5       y_1     y_2       b_i''      for a_is'' > 0

y_1          3      -3       4      [2]     -1         1       0         0         0    <- smaller value (y_1 drops from next basis)
y_2          1       1       1       3       1         0       1         2         2/3
-f           2       3       2      -1       1         0       0         0
-w          -4       2      -5      -5       0         0       0        -2

[2] is the pivot element. The most negative d_j'' occurs in the x_3 and x_4 columns.
Since there is a tie between d_3'' and d_4'', d_4'' is selected arbitrarily as the most
negative d_j'' for pivoting (x_4 enters the next basis).

Result of pivoting:

x_4         3/2    -3/2      2       1     -1/2       1/2      0         0
y_2        -7/2   [11/2]    -5       0      5/2      -3/2      1         2         4/11 <- y_2 drops from next basis
-f          7/2     3/2      4       0      1/2       1/2      0         0
-w          7/2   -11/2      5       0     -5/2       5/2      0        -2

[11/2] is the pivot element. Most negative d_j'' is d_2'' (x_2 enters next basis).

Result of pivoting (since y_1 and y_2 are dropped from basis, the columns
corresponding to them need not be filled):

x_4         6/11     0      7/11     1      2/11      Dropped            6/11
x_2        -7/11     1    -10/11     0      5/11                         4/11
-f         98/22     0    118/22     0     -4/22                        -6/11
-w            0      0       0       0       0                           0

Step 4 At this stage we notice that the present basic feasible solution does not contain
any of the artificial variables $y_1$ and $y_2$, and also the value of w is reduced to
0. This indicates that phase I is completed.

Step 5 Now we start phase II computations by dropping the w row from further 
consideration. The results of phase II are again shown in tableau form: 


Basic              Original variables                  Constant    Value of b_i''/a_is''
variables   x_1     x_2     x_3     x_4     x_5        b_i''       for a_is'' > 0

x_4         6/11     0      7/11     1      2/11        6/11       6/2
x_2        -7/11     1    -10/11     0     [5/11]       4/11       4/5  <- smaller value (x_2 drops from next basis)
-f         98/22     0    118/22     0     -4/22       -6/11

[5/11] is the pivot element. Most negative c_j'' is c_5'' (x_5 enters next basis).

Result of pivoting:

x_4         4/5    -2/5      1       1       0          2/5
x_5        -7/5    11/5     -2       0       1          4/5
-f         21/5     2/5      5       0       0         -2/5

Now, since all c_j'' are nonnegative, phase II is completed. The (unique) optimal
solution is given by

$x_1 = x_2 = x_3 = 0$ (nonbasic variables)

$x_4 = \frac{2}{5}$, $x_5 = \frac{4}{5}$ (basic variables)

$f_{\min} = \frac{2}{5}$
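
The two-phase computation of Example 3.7 can be cross-checked with exact rational arithmetic. The following Python sketch is not the book's program: the names `pivot`, `simplex`, and `two_phase` are hypothetical, and it assumes equality constraints with nonnegative right-hand sides. Two simplifications are made for brevity. It uses Bland's smallest-index rule instead of the most-negative-coefficient rule, so its pivot sequence differs from the tableaux above, and it does not handle an artificial variable that remains in the basis at zero level.

```python
from fractions import Fraction as F

def pivot(T, r, s):
    """Pivot the tableau T (list of rows) on element T[r][s]."""
    p = T[r][s]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][s] != 0:
            k = T[i][s]
            T[i] = [a - k * c for a, c in zip(T[i], T[r])]

def simplex(T, basis, cost_row, n):
    """Drive row `cost_row` to nonnegative coefficients over columns 0..n-1."""
    m = len(basis)
    while True:
        # Entering variable: lowest-index column with a negative coefficient.
        s = next((j for j in range(n) if T[cost_row][j] < 0), None)
        if s is None:
            return
        rows = [i for i in range(m) if T[i][s] > 0]
        if not rows:
            raise ValueError("objective is unbounded")
        # Leaving variable: minimum ratio, ties broken by lowest basis index.
        r = min(rows, key=lambda i: (T[i][-1] / T[i][s], basis[i]))
        pivot(T, r, s)
        basis[r] = s

def two_phase(A, b, c):
    """Minimize c.x subject to A x = b, x >= 0, with b >= 0."""
    m, n = len(A), len(A[0])
    # Constraint rows with one artificial column each, then the -f row.
    T = [[F(v) for v in A[i]] + [F(int(i == k)) for k in range(m)] + [F(b[i])]
         for i in range(m)]
    T.append([F(v) for v in c] + [F(0)] * (m + 1))
    # The -w row in canonical form: d_j = -(column sum), rhs = -(sum of b).
    T.append([-sum(T[i][j] for i in range(m)) for j in range(n)]
             + [F(0)] * m + [-sum(T[i][-1] for i in range(m))])
    basis = list(range(n, n + m))
    simplex(T, basis, m + 1, n)          # phase I: minimize w
    if T[m + 1][-1] != 0:
        raise ValueError("no feasible solution (minimum of w > 0)")
    if any(j >= n for j in basis):
        raise NotImplementedError("artificial variable left in the basis")
    simplex(T, basis, m, n)              # phase II: minimize f
    x = [F(0)] * n
    for i, j in enumerate(basis):
        x[j] = T[i][-1]
    return x, -T[m][-1]

# Data of Example 3.7.
A = [[3, -3, 4, 2, -1],
     [1, 1, 1, 3, 1]]
b = [0, 2]
c = [2, 3, 2, -1, 1]
x, f_min = two_phase(A, b, c)
print(x, f_min)
```

Artificial columns are never allowed to re-enter the basis, which corresponds to leaving the dropped columns unfilled in the tableaux.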

3.11 MATLAB SOLUTION OF LP PROBLEMS 

The solution of linear programming problems using the simplex method can be found
with MATLAB, as illustrated by the following example.

Example 3.8 Find the solution of the following linear programming problem using 
MATLAB (simplex method): 

Minimize $f = -x_1 - 2x_2 - x_3$

subject to

$$
\begin{aligned}
2x_1 + x_2 - x_3 &\le 2 \\
2x_1 - x_2 + 5x_3 &\le 6 \\
4x_1 + x_2 + x_3 &\le 6 \\
x_i \ge 0; \quad i &= 1, 2, 3
\end{aligned}
$$


SOLUTION 


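Step 1 Express the objective function in the form $f(x) = f^T x$ and identify the vectors
x and f as

$$
x = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}
\quad \text{and} \quad
f = \begin{Bmatrix} -1 \\ -2 \\ -1 \end{Bmatrix}
$$

Express the constraints in the form $Ax \le b$ and identify the matrix A and the
vector b as

$$
A = \begin{bmatrix} 2 & 1 & -1 \\ 2 & -1 & 5 \\ 4 & 1 & 1 \end{bmatrix}
\quad \text{and} \quad
b = \begin{Bmatrix} 2 \\ 6 \\ 6 \end{Bmatrix}
$$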

Step 2 Use the command for executing the linear programming program using the simplex
method as indicated below:

clc
clear all
f = [-1; -2; -1];
A = [2  1 -1;
     2 -1  5;
     4  1  1];
b = [2; 6; 6];
lb = zeros(3, 1);
Aeq = [];
beq = [];
options = optimset('LargeScale', 'off', 'Simplex', 'on');
[x, fval, exitflag, output] = linprog(f, A, b, Aeq, beq, lb, [], [], ...
    optimset('Display', 'iter'))

This produces the solution or output as follows:

Optimization terminated.
x =
     0
     4
     2
fval =
   -10
exitflag =
     1
output =
      iterations: 3
       algorithm: 'medium scale: simplex'
    cgiterations: []
         message: 'Optimization terminated.'
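
The reported solution of Example 3.8 can be checked independently of any solver. Because the feasible region here is bounded, the minimum lies at a vertex, so a brute-force pass over all intersections of three constraint boundaries is enough. The Python sketch below does this with exact fractions; the helper name `solve_square` is hypothetical, and the approach is a check for small problems only, not the simplex method.

```python
from fractions import Fraction as F
from itertools import combinations

def solve_square(M, r):
    """Solve M z = r by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(rv)] for row, rv in zip(M, r)]
    for col in range(n):
        piv = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col:
                k = aug[i][col]
                aug[i] = [a - k * c for a, c in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

f = [-1, -2, -1]
# All constraints written as G z <= h; the last three rows encode x_i >= 0.
G = [[2, 1, -1], [2, -1, 5], [4, 1, 1],
     [-1, 0, 0], [0, -1, 0], [0, 0, -1]]
h = [2, 6, 6, 0, 0, 0]

best = None
for rows in combinations(range(len(G)), 3):
    z = solve_square([G[i] for i in rows], [h[i] for i in rows])
    if z is None:
        continue
    feasible = all(sum(g * v for g, v in zip(G[i], z)) <= h[i]
                   for i in range(len(G)))
    if feasible:
        val = sum(fi * v for fi, v in zip(f, z))
        if best is None or val < best[0]:
            best = (val, z)

print(best)
```

The best vertex found is x = (0, 4, 2) with f = -10, in agreement with the values of `x` and `fval` listed in the output.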
REVIEW QUESTIONS

3.1 Define a line segment in n-dimensional space. 

3.2 What happens when m = n in a (standard) LP problem? 

3.3 How many basic solutions can an LP problem have? 

3.4 State an LP problem in standard form. 

3.5 State four applications of linear programming. 

3.6 Why is linear programming important in several types of industries? 

3.7 Define the following terms: point, hyperplane, convex set, extreme point.

3.8 What is a basis?

3.9 What is a pivot operation? 

3.10 What is the difference between a convex polyhedron and a convex polytope? 

3.11 What is a basic degenerate solution? 

3.12 What is the difference between the simplex algorithm and the simplex method? 

3.13 How do you identify the optimum solution in the simplex method? 

3.14 Define the infeasibility form. 

3.15 What is the difference between a slack and a surplus variable? 

3.16 Can a slack variable be part of the basis at the optimum solution of an LP problem? 

3.17 Can an artificial variable be in the basis at the optimum point of an LP problem? 

3.18 How do you detect an unbounded solution in the simplex procedure? 

3.19 How do you identify the presence of multiple optima in the simplex method? 

3.20 What is a canonical form? 

3.21 Answer true or false: 

(a) The feasible region of an LP problem is always bounded. 

(b) An LP problem will have infinite solutions whenever a constraint is redundant. 

(c) The optimum solution of an LP problem always lies at a vertex. 

(d) A linear function is always convex. 

(e) The feasible space of some LP problems can be nonconvex. 

(f) The variables must be nonnegative in a standard LP problem. 

(g) The optimal solution of an LP problem can be called the optimal basic solution. 

(h) Every basic solution represents an extreme point of the convex set of feasible solu- 
tions. 

(i) We can generate all the basic solutions of an LP problem using pivot operations. 

(j) The simplex algorithm permits us to move from one basic solution to another basic
solution.

(k) The slack and surplus variables can be unrestricted in sign.

(l) An LP problem will have an infinite number of feasible solutions.

(m) An LP problem will have an infinite number of basic feasible solutions.

(n) The right-hand-side constants can assume negative values during the simplex proce- 
dure. 

(o) All the right-hand-side constants can be zero in an LP problem. 

(p) The cost coefficient corresponding to a nonbasic variable can be positive in a basic 
feasible solution. 

(q) If all elements in the pivot column are negative, the LP problem will not have a 
feasible solution. 

(r) A basic degenerate solution can have negative values for some of the variables. 

(s) If a greater-than or equal-to type of constraint is active at the optimum point, the 
corresponding surplus variable must have a positive value. 

(t) A pivot operation brings a nonbasic variable into the basis.

(u) The optimum solution of an LP problem cannot contain slack variables in the basis.

(v) If the infeasibility form has a nonzero value at the end of phase I, it indicates an 
unbounded solution to the LP problem. 

(w) The solution of an LP problem can be a local optimum. 

(x) In a standard LP problem, all the cost coefficients will be positive. 

(y) In a standard LP problem, all the right-hand-side constants will be positive. 

(z) In a LP problem, the number of inequality constraints cannot exceed the number of 
variables. 

(aa) A basic feasible solution cannot have zero value for any of the variables. 


PROBLEMS 


3.1 State the following LP problem in standard form: 

Maximize $f = -2x_1 - x_2 + 5x_3$

subject to

$x_1 - 2x_2 + x_3 \le 8$
$3x_1 - 2x_2 \ge -18$
$2x_1 + x_2 - 2x_3 \le -4$


3.2 State the following LP problem in standard form: 

Maximize $f = x_1 - 8x_2$

subject to

$3x_1 + 2x_2 \ge 6$
$9x_1 + 7x_2 \le 108$
$2x_1 - 5x_2 \ge -35$
$x_1$, $x_2$ unrestricted in sign

3.3 Solve the following system of equations using pivot operations: 

$6x_1 - 2x_2 + 3x_3 = 11$
$4x_1 + 7x_2 + x_3 = 21$
$5x_1 + 8x_2 + 9x_3 = 48$

3.4 It is proposed to build a reservoir of capacity $x_1$ to better control the supply of water to
an irrigation district [3.15, 3.17]. The inflow to the reservoir is expected to be $4.5 \times 10^6$
acre-ft during the wet (rainy) season and $1.1 \times 10^6$ acre-ft during the dry (summer)
season. Between the reservoir and the irrigation district, one stream (A) adds water to
and another stream (B) carries water away from the main stream, as shown in Fig. 3.16.
Stream A adds $1.2 \times 10^6$ and $0.3 \times 10^6$ acre-ft of water during the wet and dry seasons,
respectively. Stream B takes away $0.5 \times 10^6$ and $0.2 \times 10^6$ acre-ft of water during the
wet and dry seasons, respectively. Of the total amount of water released to the irrigation
district per year ($x_2$), 30% is to be released during the wet season and 70% during the
dry season. The yearly cost of diverting the required amount of water from the main
stream to the irrigation district is given by $18(0.3x_2) + 12(0.7x_2)$. The cost of building
and maintaining the reservoir, reduced to a yearly basis, is given by $25x_1$. Determine
the values of $x_1$ and $x_2$ to minimize the total yearly cost.

3.5 Solve the following system of equations using pivot operations: 


$4x_1 - 7x_2 + 2x_3 = -8$
$3x_1 + 4x_2 - 5x_3 = -8$
$5x_1 + x_2 - 8x_3 = -34$

3.6 What elementary operations can be used to transform

$2x_1 + x_2 + x_3 = 9$
$x_1 + x_2 + x_3 = 6$
$2x_1 + 3x_2 + x_3 = 13$

into

$x_1 = 3$
$x_2 = 2$
$x_1 + 3x_2 + x_3 = 10$

Find the solution of this system by reducing into canonical form. 

3.7 Find the solution of the following LP problem graphically: 

Maximize $f = 2x_1 + 6x_2$

subject to

$-x_1 + x_2 \le 1$
$2x_1 + x_2 \le 2$
$x_1 \ge 0$, $x_2 \ge 0$

3.8 Find the solution of the following LP problem graphically:

Minimize $f = -3x_1 + 2x_2$

subject to

$0 \le x_1 \le 4$
$1 \le x_2 \le 6$
$x_1 + x_2 \le 5$

3.9 Find the solution of the following LP problem graphically:

Minimize $f = 3x_1 + 2x_2$

subject to

$8x_1 + x_2 \ge 8$
$2x_1 + x_2 \ge 6$
$x_1 + 3x_2 \ge 6$
$x_1 + 6x_2 \ge 8$
$x_1 \ge 0$, $x_2 \ge 0$

3.10 Find the solution of the following problem by the graphical method:

Minimize / = xfx% 


subject to 


x\x\ > e 
* 1*2 — e 
* 1*2 — ^ 


3 

4 


*1 > 0 , *2 > 0 


where e is the base of natural logarithms. 
3.11 Prove Theorem 3.6. 


For Problems 3.12 to 3.42, use a graphical procedure to identify (a) the feasible region,
(b) the region where the slack (or surplus) variables are zero, and (c) the optimum
solution.

3.12 Maximize $f = 6x + 7y$
subject to

$7x + 6y \le 42$
$5x + 9y \le 45$
$x - y \le 4$
$x \ge 0$, $y \ge 0$

3.13 Rework Problem 3.12 when x and y are unrestricted in sign.

3.14 Maximize $f = 19x + 7y$
subject to

$7x + 6y \le 42$
$5x + 9y \le 45$
$x - y \le 4$
$x \ge 0$, $y \ge 0$

3.15 Rework Problem 3.14 when x and y are unrestricted in sign.

3.16 Maximize $f = x + 2y$
subject to

$x - y \ge -8$
$5x - y \ge 0$
$x + y \ge 8$
$-x + 6y \ge 12$
$5x + 2y \le 68$
$x \le 10$
$x \ge 0$, $y \ge 0$


3.17 Rework Problem 3.16 by changing the objective to Minimize $f = x - y$.

3.18 Maximize $f = x + 2y$
subject to

$x - y \ge -8$
$5x - y \ge 0$
$x + y \ge 8$
$-x + 6y \ge 12$
$5x + 2y \ge 68$
$x \le 10$
$x \ge 0$, $y \ge 0$

3.19 Rework Problem 3.18 by changing the objective to Minimize $f = x - y$.

3.20 Maximize $f = x + 3y$
subject to

$-4x + 3y \le 12$
$x + y \le 7$
$x - 4y \le 2$
$x \ge 0$, $y \ge 0$

3.21 Minimize $f = x + 3y$
subject to

$-4x + 3y \le 12$
$x + y \le 7$
$x - 4y \le 2$
x and y are unrestricted in sign


3.22 Rework Problem 3.20 by changing the objective to Maximize $f = x + y$.

3.23 Maximize $f = x + 3y$
subject to

$-4x + 3y \le 12$
$x + y \le 7$
$x - 4y \ge 2$
$x \ge 0$, $y \ge 0$

3.24 Minimize $f = x - 8y$
subject to

$3x + 2y \ge 6$
$x - y \le 6$
$9x + 7y \le 108$
$3x + 7y \le 70$
$2x - 5y \ge -35$
$x \ge 0$, $y \ge 0$

3.25 Rework Problem 3.24 by changing the objective to Maximize $f = x - 8y$.

3.26 Maximize $f = x - 8y$
subject to

$3x + 2y \ge 6$
$x - y \le 6$
$9x + 7y \le 108$
$3x + 7y \le 70$
$2x - 5y \ge -35$
$x \ge 0$, y is unrestricted in sign

3.27 Maximize $f = 5x - 2y$
subject to

$3x + 2y \ge 6$
$x - y \le 6$
$9x + 7y \le 108$
$3x + 7y \le 70$
$2x - 5y \ge -35$
$x \ge 0$, $y \ge 0$

3.28 Minimize $f = x - 4y$
subject to

$x - y \ge -4$
$4x + 5y \le 45$
$5x - 2y \le 20$
$5x + 2y \le 10$
$x \ge 0$, $y \ge 0$

3.29 Maximize $f = x - 4y$
subject to

$x - y \ge -4$
$4x + 5y \le 45$
$5x - 2y \le 20$
$5x + 2y \ge 10$
$x \ge 0$, y is unrestricted in sign

3.30 Minimize $f = x - 4y$
subject to

$x - y \ge -4$
$4x + 5y \le 45$
$5x - 2y \le 20$
$5x + 2y \ge 10$
$x \ge 0$, $y \ge 0$

3.31 Rework Problem 3.30 by changing the objective to Maximize $f = x - 4y$.

3.32 Minimize $f = 4x + 5y$
subject to

$10x + y \ge 10$
$5x + 4y \ge 20$
$3x + 7y \ge 21$
$x + 12y \ge 12$
$x \ge 0$, $y \ge 0$

3.33 Rework Problem 3.32 by changing the objective to Maximize $f = 4x + 5y$.

3.34 Rework Problem 3.32 by changing the objective to Minimize $f = 6x + 2y$.

3.35 Minimize $f = 6x + 2y$
subject to

$10x + y \ge 10$
$5x + 4y \ge 20$
$3x + 7y \ge 21$
$x + 12y \ge 12$
x and y are unrestricted in sign


3.36 Minimize $f = 5x + 2y$
subject to

$3x + 4y \le 24$
$x - y \le 3$
$x + 4y \ge 4$
$3x + y \ge 3$
$x \ge 0$, $y \ge 0$

3.37 Rework Problem 3.36 by changing the objective to Maximize $f = 5x + 2y$.

3.38 Rework Problem 3.36 when x is unrestricted in sign and $y \ge 0$.

3.39 Maximize $f = 5x + 2y$
subject to

$3x + 4y \le 24$
$x - y \le 3$
$x + 4y \le 4$
$3x + y \ge 3$
$x \ge 0$, $y \ge 0$

3.40 Maximize $f = 3x + 2y$
subject to

$9x + 10y \le 330$
$21x - 4y \ge -36$
$x + 2y \ge 6$
$6x - y \le 72$
$3x + y \le 54$
$x \ge 0$, $y \ge 0$

3.41 Rework Problem 3.40 by changing the constraint $x + 2y \ge 6$ to $x + 2y \le 6$.

3.42 Maximize $f = 3x + 2y$
subject to

$9x + 10y \le 330$
$21x - 4y \ge -36$
$x + 2y \le 6$
$6x - y \le 72$
$3x + y \ge 54$
$x \ge 0$, $y \ge 0$

3.43 Maximize $f = 3x + 2y$
subject to

$21x - 4y \ge -36$
$x + 2y \ge 6$
$6x - y \le 72$
$x \ge 0$, $y \ge 0$

3.44 Reduce the system of equations

$2x_1 + 3x_2 - 2x_3 - 7x_4 = 2$
$x_1 + x_2 - x_3 + 3x_4 = 12$
$x_1 - x_2 + x_3 + 5x_4 = 8$

into a canonical system with $x_1$, $x_2$, and $x_3$ as basic variables. From this derive all other
canonical forms.

3.45 Maximize $f = 240x_1 + 104x_2 + 60x_3 + 19x_4$
subject to

$20x_1 + 9x_2 + 6x_3 + x_4 \le 20$
$10x_1 + 4x_2 + 2x_3 + x_4 \le 10$
$x_i \ge 0$, i = 1 to 4

Find all the basic feasible solutions of the problem and identify the optimal solution.

3.46 A progressive university has decided to keep its library open round the clock and gathered 
that the following number of attendants are required to reshelve the books: 


Time of day      Minimum number of
(hours)          attendants required

0-4                      4
4-8                      7
8-12                     8
12-16                    9
16-20                   14
20-24                    3


If each attendant works eight consecutive hours per day, formulate the problem of finding 
the minimum number of attendants necessary to satisfy the requirements above as a LP 
problem. 

3.47 A paper mill received an order for the supply of paper rolls of widths and lengths as 
indicated below: 


Number of rolls      Width of roll      Length
ordered              (m)                (m)

1                    6                  100
1                    8                  300
1                    9                  200

The mill produces rolls only in two standard widths, 10 and 20 m. The mill cuts the
standard rolls to size to meet the specifications of the orders. Assuming that there is no
limit on the lengths of the standard rolls, find the cutting pattern that minimizes the trim
losses while satisfying the order above.

3.48 Solve the LP problem stated in Example 1.6 for the following data: $l = 2$ m,
$W_1 = 3000$ N, $W_2 = 2000$ N, $W_3 = 1000$ N, and $w_1 = w_2 = w_3 = 200$ N.

3.49 Find the solution of Problem 1.1 using the simplex method. 

3.50 Find the solution of Problem 1.15 using the simplex method. 

3.51 Find the solution of Example 3.1 using (a) the graphical method and (b) the simplex 
method. 

3.52 In the scaffolding system shown in Fig. 3.17, loads $x_1$ and $x_2$ are applied on beams 2 and
3, respectively. Ropes A and B can carry a load of $W_1 = 300$ lb each; the middle ropes,
C and D, can withstand a load of $W_2 = 200$ lb each, and ropes E and F are capable
of supporting a load $W_3 = 100$ lb each. Formulate the problem of finding the loads $x_1$
and $x_2$ and their location parameters $x_3$ and $x_4$ to maximize the total load carried by the
system, $x_1 + x_2$, by assuming that the beams and ropes are weightless.

3.53 A manufacturer produces three machine parts. A, B, and C. The raw material costs 
of parts A, B, and C are $5, $10, and $15 per unit, and the corresponding prices of 
the finished parts are $50, $75, and $100 per unit. Part A requires turning and drilling 
operations, while part B needs milling and drilling operations. Part C requires turning 
and milling operations. The number of parts that can be produced on various machines 
per day and the daily costs of running the machines are given below: 



                        Number of parts that can be produced on
Machine part            Turning lathes    Drilling machines    Milling machines

A                       15                15                   -
B                       -                 20                   30
C                       25                -                    10
Cost of running the
machines per day        $250              $200                 $300


Formulate the problem of maximizing the profit. 




Figure 3.17 Scaffolding system with three beams.

Solve Problems 3.54-3.90 by the simplex method.

3.54 Problem 1.22
3.55 Problem 1.23
3.56 Problem 1.24
3.57 Problem 1.25
3.58 Problem 3.7
3.59 Problem 3.12
3.60 Problem 3.13
3.61 Problem 3.14
3.62 Problem 3.15
3.63 Problem 3.16
3.64 Problem 3.17
3.65 Problem 3.18
3.66 Problem 3.19
3.67 Problem 3.20
3.68 Problem 3.21
3.69 Problem 3.22
3.70 Problem 3.23
3.71 Problem 3.24
3.72 Problem 3.25
3.73 Problem 3.26
3.74 Problem 3.27
3.75 Problem 3.28
3.76 Problem 3.29
3.77 Problem 3.30
3.78 Problem 3.31
3.79 Problem 3.32
3.80 Problem 3.33
3.81 Problem 3.34
3.82 Problem 3.35
3.83 Problem 3.36
3.84 Problem 3.37
3.85 Problem 3.38
3.86 Problem 3.39
3.87 Problem 3.40
3.88 Problem 3.41
3.89 Problem 3.42
3.90 Problem 3.43

3.91 The temperatures measured at various points inside a heated wall are given below: 


Distance from the heated surface as a
percentage of wall thickness, x_i        0      20     40     60     80     100
Temperature, t_i (°C)                    400    350    250    175    100    50

It is decided to use a linear model to approximate the measured values as

$$t = a + bx \tag{1}$$

where t is the temperature, x the percentage of wall thickness, and a and b the coefficients
that are to be estimated. Obtain the best estimates of a and b using linear programming
with the following objectives.

(a) Minimize the sum of absolute deviations between the measured values and those
given by Eq. (1): $\sum_i |a + bx_i - t_i|$.

(b) Minimize the maximum absolute deviation between the measured values and those
given by Eq. (1):

$$\max_i |a + bx_i - t_i|$$


3.92 A snack food manufacturer markets two kinds of mixed nuts, labeled A and B. Mixed 
nuts A contain 20% almonds, 10% cashew nuts, 15% walnuts, and 55% peanuts. Mixed 
nuts B contain 10% almonds, 20% cashew nuts, 25% walnuts, and 45% peanuts. A 
customer wants to use mixed nuts A and B to prepare a new mix that contains at least 
4 lb of almonds, 5 lb of cashew nuts, and 6 lb of walnuts, for a party. If mixed nuts A
and B cost $2.50 and $3.00 per pound, respectively, determine the amounts of mixed 
nuts A and B to be used to prepare the new mix at a minimum cost. 

3.93 A company produces three types of bearings, $B_1$, $B_2$, and $B_3$, on two machines, $A_1$
and $A_2$. The processing times of the bearings on the two machines are indicated in the
following table:

               Processing time (min) for bearing:
Machine        B_1      B_2      B_3

A_1            10       6        12
A_2            8        4        4

The times available on machines $A_1$ and $A_2$ per day are 1200 and 1000 minutes,
respectively. The profits per unit of $B_1$, $B_2$, and $B_3$ are $4, $2, and $3, respectively. The
maximum number of units the company can sell are 500, 400, and 600 for $B_1$, $B_2$, and
$B_3$, respectively. Formulate and solve the problem for maximizing the profit.

3.94 Two types of printed circuit boards A and B are produced in a computer manufacturing 
company. The component placement time, soldering time, and inspection time required 
in producing each unit of A and B are given below: 



                     Time required per unit (min) for:
Circuit board        Component placement    Soldering    Inspection

A                    16                     10           4
B                    10                     12           8


If the amounts of time available per day for component placement, soldering, and inspec- 
tion are 1500, 1000, and 500 person-minutes, respectively, determine the number of units 
of A and B to be produced for maximizing the production. If each unit of A and B 
contributes a profit of $10 and $15, respectively, determine the number of units of A 
and B to be produced for maximizing the profit. 

3.95 A paper mill produces paper rolls in two standard widths; one with width 20 in. and 
the other with width 50 in. It is desired to produce new rolls with different widths as 
indicated below: 


Width (in.)      Number of rolls required

40               150
30               200
15               50
6                100


The new rolls are to be produced by cutting the rolls of standard widths to minimize 
the trim loss. Formulate the problem as an LP problem. 

3.96 A manufacturer produces two types of machine parts, $P_1$ and $P_2$, using lathes and
milling machines. The machining times required by each part on the lathe and the
milling machine and the profit per unit of each part are given below:

                  Machine time (hr) required by
                  each unit on:
Machine part      Lathe      Milling machine      Cost per unit

P_1               5          2                    $200
P_2               4          4                    $300

If the total machining times available in a week are 500 hours on lathes and 400 hours
on milling machines, determine the number of units of $P_1$ and $P_2$ to be produced per
week to maximize the profit.

3.97 A bank offers four different types of certificates of deposits (CDs) as indicated below:

CD type      Duration (yr)      Total interest at maturity (%)

1            0.5                5
2            1.0                7
3            2.0                10
4            4.0                15


If a customer wants to invest $50,000 in various types of CDs, determine the plan that 
yields the maximum return at the end of the fourth year. 

3.98 The production of two machine parts A and B requires operations on a lathe (L), a 
shaper ( S ), a drilling machine (D), a milling machine (M), and a grinding machine 
(G). The machining times required by A and B on various machines are given below. 




                  Machine time required (hours per unit) on:
Machine part      L        S        D        M        G

A                 0.6      0.4      0.1      0.5      0.2
B                 0.9      0.1      0.2      0.3      0.3

The number of machines of different types available is given by L: 10, S: 3, D: 4, M:
6, and G: 5. Each machine can be used for 8 hours a day for 30 days in a month.

(a) Determine the production plan for maximizing the output in a month.

(b) If the number of units of A is to be equal to the number of units of B, find the 
optimum production plan. 

3.99 A salesman sells two types of vacuum cleaners, A and B. He receives a commission of 
20% on all sales, provided that at least 10 units each of A and B are sold per month. 
The salesman needs to make telephone calls to make appointments with customers and 
demonstrate the products in order to sell the products. The selling price of the products, 
the average money to be spent on telephone calls, the time to be spent on demonstrations, 
and the probability of a potential customer buying the product are given below: 



           Selling      Money to be spent on      Time to be spent in        Probability of a
Vacuum     price per    telephone calls to find   demonstrations to a        potential customer
cleaner    unit         a potential customer      potential customer (hr)    buying the product

A          $250         $3                        3                          0.4
B          $100         $1                        1                          0.8


In a particular month, the salesman expects to sell at most 25 units of A and 45 units of 
B. If he plans to spend a maximum of 200 hours in the month, formulate the problem 
of determining the number of units of A and B to be sold to maximize his income. 

3.100 An electric utility company operates two thermal power plants, A and B, using three
different grades of coal, $C_1$, $C_2$, and $C_3$. The minimum power to be generated at plants A
and B is 30 and 80 MWh, respectively. The quantities of various grades of coal required
to generate 1 MWh of power at each power plant, the pollution caused by the various
grades of coal at each power plant, and the costs of coal are given in the following table:

             Quantity of coal required
             to generate 1 MWh at the       Pollution caused at      Cost of coal at
             power plant (tons)             power plant              power plant
Coal type    A          B                   A          B             A          B

C_1          2.5        1.5                 1.0        1.5           20         18
C_2          1.0        2.0                 1.5        2.0           25         28
C_3          3.0        2.5                 2.0        2.5           18         12

Formulate the problem of determining the amounts of different grades of coal to be used 
at each power plant to minimize (a) the total pollution level, and (b) the total cost of 
operation. 

3.101 A grocery store wants to buy five different types of vegetables from four farms in a 
month. The prices of the vegetables at different farms, the capacities of the farms, and 
the minimum requirements of the grocery store are indicated in the following table: 




                    Price ($/ton) of vegetable type                              Maximum (of all
                    1          2          3        4            5               types combined)
Farm                (Potato)   (Tomato)   (Okra)   (Eggplant)   (Spinach)       they can supply

1                   200        600        1600     800          1200            180
2                   300        550        1400     850          1100            200
3                   250        650        1500     700          1000            100
4                   150        500        1700     900          1300            120
Minimum amount
required (tons)     100        60         20       80           40



Formulate the problem of determining the buying scheme that corresponds to a 
minimum cost. 

3.102 A steel plant produces steel using four different types of processes. The iron ore, coal, 
and labor required, the amounts of steel and side products produced, the cost information, 
and the physical limitations on the system are given below: 


                Iron ore        Coal            Labor                Steel           Side products
                required        required        required             produced        produced
Process type    (tons/day)      (tons/day)      (person-days)        (tons/day)      (tons/day)

1               5               3               6                    4               1
2               8               5               12                   6               2
3               3               2               5                    2               1
4               10              7               12                   6               4
Cost            $50/ton         $10/ton         $150/person-day      $350/ton        $100/ton
Limitations     600 tons        250 tons        No limitations       All steel       Only 200 tons
                available       available       on availability      produced        can be sold
                per month       per month       of labor             can be sold     per month

Assuming that a particular process can be employed for any number of days in a
30-day month, determine the operating schedule of the plant for maximizing the profit.

3.103 Solve Example 3.7 using MATLAB (simplex method).

3.104 Solve Problem 3.12 using MATLAB (simplex method).

3.105 Solve Problem 3.24 using MATLAB (simplex method).

3.106 Find the optimal solution of the LP problem stated in Problem 3.45 using MATLAB
(simplex method).

3.107 Find the optimal solution of the LP problem described in Problem 3.101 using MATLAB.

Linear Programming II:
Additional Topics and Extensions

4.1 INTRODUCTION 

If a LP problem involving several variables and constraints is to be solved by using the 
simplex method described in Chapter 3, it requires a large amount of computer storage 
and time. Some techniques, which require less computational time and storage space 
compared to the original simplex method, have been developed. Among these tech- 
niques, the revised simplex method is very popular. The principal difference between 
the original simplex method and the revised one is that in the former we transform all 
the elements of the simplex tableau, while in the latter we need to transform only the 
elements of an inverse matrix. Associated with every LP problem, another LP problem, 
called the dual , can be formulated. The solution of a given LP problem, in many cases, 
can be obtained by solving its dual in a much simpler manner. 

As stated above, one of the difficulties in certain practical LP problems is that the 
number of variables and/or the number of constraints is so large that it exceeds the 
storage capacity of the available computer. If the LP problem has a special structure, 
a principle known as the decomposition principle can be used to solve the problem 
more efficiently. In many practical problems, one will be interested not only in finding 
the optimum solution to a LP problem, but also in finding how the optimum solution 
changes when some parameters of the problem, such as the cost coefficients, change. Hence
the sensitivity or postoptimality analysis becomes very important. 

An important special class of LP problems, known as transportation problems, 
occurs often in practice. These problems can be solved by algorithms that are more 
efficient (for this class of problems) than the simplex method. Karmarkar's method is 
an interior method and has been shown to be superior to the simplex method of Dantzig 
for large problems. The quadratic programming problem is the best-behaved nonlinear 
programming problem. It has a quadratic objective function and linear constraints and 
is convex (for minimization problems). Hence the quadratic programming problem can 
be solved by suitably modifying the linear programming techniques. All these topics 
are discussed in this chapter. 


4.2 REVISED SIMPLEX METHOD 

We notice that the simplex method requires the computing and recording of an entirely 
new tableau at each iteration. But much of the information contained in the tableau is 
not used; only the following items are needed. 
1. The relative cost coefficients $\bar{c}_j$, to compute*

$$\bar{c}_s = \min_j(\bar{c}_j) \tag{4.1}$$

$\bar{c}_s$ determines the variable $x_s$ that has to be brought into the basis in the next
iteration.

2. Assuming that $\bar{c}_s < 0$, the elements of the updated column

$$\bar{\mathbf{A}}_s = \begin{Bmatrix} \bar{a}_{1s} \\ \bar{a}_{2s} \\ \vdots \\ \bar{a}_{ms} \end{Bmatrix}$$

and the values of the basic variables

$$\mathbf{X}_B = \begin{Bmatrix} \bar{b}_1 \\ \bar{b}_2 \\ \vdots \\ \bar{b}_m \end{Bmatrix}$$

have to be calculated. With this information, the variable $x_r$ that has to be
removed from the basis is found by computing the quantity

$$\frac{\bar{b}_r}{\bar{a}_{rs}} = \min_{\bar{a}_{is} > 0}\left(\frac{\bar{b}_i}{\bar{a}_{is}}\right) \tag{4.2}$$

and a pivot operation is performed on $\bar{a}_{rs}$. Thus only one nonbasic column $\bar{\mathbf{A}}_s$ of
the current tableau is useful in finding $x_r$. Since most linear programming
problems involve many more variables (columns) than constraints (rows), considerable
effort and storage is wasted in dealing with the $\bar{\mathbf{A}}_j$ for $j \neq s$. Hence
it would be more efficient if we could generate the modified cost coefficients $\bar{c}_j$
and the column $\bar{\mathbf{A}}_s$ from the original problem data itself. The revised simplex
method is used for this purpose; it makes use of the inverse of the current basis
matrix in generating the required quantities.
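The two selection rules, Eqs. (4.1) and (4.2), can be sketched in a few lines of Python. This is only an illustration: the function names `entering_index` and `leaving_row` are hypothetical, indices are zero-based, and the sample numbers are the cycle 0 data of Example 4.1 later in this section.

```python
from fractions import Fraction

def entering_index(c_bar):
    """Eq. (4.1): index s of the most negative relative cost, or None if none is negative."""
    s = min(range(len(c_bar)), key=lambda j: c_bar[j])
    return s if c_bar[s] < 0 else None

def leaving_row(b_bar, a_s):
    """Eq. (4.2): row r minimizing b_i / a_is over the rows with a_is > 0."""
    rows = [i for i, a in enumerate(a_s) if a > 0]
    if not rows:
        return None  # no positive entry in the column: the objective is unbounded
    return min(rows, key=lambda i: Fraction(b_bar[i]) / Fraction(a_s[i]))

# Cycle 0 of Example 4.1: relative costs, basic values, and the updated x_2 column.
s = entering_index([-1, -2, -1, 0, 0, 0])
r = leaving_row([2, 6, 6], [1, -1, 1])
print(s, r)
```

Only the cost row and a single updated column enter these two rules, which is the observation that motivates the revised simplex method.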


Theoretical Development Although the revised simplex method is applicable for 
both phase I and phase II computations, the method is initially developed by considering 
linear programming in phase II for simplicity. Later, a step-by-step procedure is given 
to solve the general linear programming problem involving both phases I and II. 

Let the given linear programming problem (phase II) be written in column
form as

Minimize

$$f(\mathbf{X}) = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \tag{4.3}$$

*The modified values of $b_i$, $a_{ij}$, and $c_j$ are denoted by overbars in this chapter (they were denoted by primes
in Chapter 3).
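subject to

$$\mathbf{A}\mathbf{X} = \mathbf{A}_1 x_1 + \mathbf{A}_2 x_2 + \cdots + \mathbf{A}_n x_n = \mathbf{b} \tag{4.4}$$

$$\mathbf{X} \geq \mathbf{0} \tag{4.5}$$

where $\mathbf{b}$ is $m \times 1$, $\mathbf{X}$ is $n \times 1$, and the $j$th column of the coefficient matrix $\mathbf{A}$ is given by

$$\mathbf{A}_j = \begin{Bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{Bmatrix} \quad (m \times 1)$$

Assuming that the linear programming problem has a solution, let

$$\mathbf{B} = [\mathbf{A}_{j_1}\ \mathbf{A}_{j_2}\ \cdots\ \mathbf{A}_{j_m}]$$

be a basis matrix with

$$\mathbf{X}_B = \begin{Bmatrix} x_{j_1} \\ x_{j_2} \\ \vdots \\ x_{j_m} \end{Bmatrix} \quad \text{and} \quad \mathbf{c}_B = \begin{Bmatrix} c_{j_1} \\ c_{j_2} \\ \vdots \\ c_{j_m} \end{Bmatrix} \quad (m \times 1)$$

representing the corresponding vectors of basic variables and cost coefficients, respectively.
If $\mathbf{X}_B$ is feasible, we have

$$\mathbf{X}_B = \mathbf{B}^{-1}\mathbf{b} = \bar{\mathbf{b}} \geq \mathbf{0}$$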


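As in the regular simplex method, the objective function is included as the $(m + 1)$th
equation and $-f$ is treated as a permanent basic variable. The augmented system can
be written as

$$\sum_{j=1}^{n} \mathbf{P}_j x_j + \mathbf{P}_{n+1}(-f) = \mathbf{q} \tag{4.6}$$

where

$$\mathbf{P}_j = \begin{Bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \\ c_j \end{Bmatrix},\ j = 1 \text{ to } n, \qquad \mathbf{P}_{n+1} = \begin{Bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{Bmatrix} \qquad \text{and} \qquad \mathbf{q} = \begin{Bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \\ 0 \end{Bmatrix}$$

Since $\mathbf{B}$ is a feasible basis for the system of Eqs. (4.4), the $(m+1) \times (m+1)$ matrix $\mathbf{D}$ defined by

$$\mathbf{D} = [\mathbf{P}_{j_1}\ \mathbf{P}_{j_2}\ \cdots\ \mathbf{P}_{j_m}\ \mathbf{P}_{n+1}] = \begin{bmatrix} \mathbf{B} & \mathbf{0} \\ \mathbf{c}_B^T & 1 \end{bmatrix}$$

will be a feasible basis for the augmented system of Eqs. (4.6). The inverse of $\mathbf{D}$ can
be found to be

$$\mathbf{D}^{-1} = \begin{bmatrix} \mathbf{B}^{-1} & \mathbf{0} \\ -\mathbf{c}_B^T\mathbf{B}^{-1} & 1 \end{bmatrix}$$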
Definition. The row vector

$$\mathbf{c}_B^T\mathbf{B}^{-1} = \boldsymbol{\pi}^T = (\pi_1\ \ \pi_2\ \ \cdots\ \ \pi_m) \tag{4.7}$$

is called the vector of simplex multipliers relative to the $f$ equation. If the computations
correspond to phase I, two vectors of simplex multipliers, one relative to the $f$ equation
and the other relative to the $w$ equation, are to be defined as

$$\boldsymbol{\pi}^T = \mathbf{c}_B^T\mathbf{B}^{-1} = (\pi_1\ \ \pi_2\ \ \cdots\ \ \pi_m)$$

$$\boldsymbol{\sigma}^T = \mathbf{d}_B^T\mathbf{B}^{-1} = (\sigma_1\ \ \sigma_2\ \ \cdots\ \ \sigma_m)$$

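By premultiplying each column of Eq. (4.6) by $\mathbf{D}^{-1}$, we obtain the following canonical
system of equations†:

$$\begin{Bmatrix} x_{j_1} \\ x_{j_2} \\ \vdots \\ x_{j_m} \end{Bmatrix} + \sum_{j\ \text{nonbasic}} \bar{\mathbf{A}}_j x_j = \begin{Bmatrix} \bar{b}_1 \\ \bar{b}_2 \\ \vdots \\ \bar{b}_m \end{Bmatrix}$$

$$-f + \sum_{j\ \text{nonbasic}} \bar{c}_j x_j = -f_0$$

where

$$\begin{Bmatrix} \bar{\mathbf{A}}_j \\ \bar{c}_j \end{Bmatrix} = \mathbf{D}^{-1}\mathbf{P}_j = \begin{bmatrix} \mathbf{B}^{-1} & \mathbf{0} \\ -\boldsymbol{\pi}^T & 1 \end{bmatrix} \begin{Bmatrix} \mathbf{A}_j \\ c_j \end{Bmatrix} \tag{4.8}$$

†Premultiplication of $\mathbf{P}_j x_j$ by $\mathbf{D}^{-1}$ gives

$$\mathbf{D}^{-1}\mathbf{P}_j x_j = \begin{bmatrix} \mathbf{B}^{-1} & \mathbf{0} \\ -\boldsymbol{\pi}^T & 1 \end{bmatrix} \begin{Bmatrix} \mathbf{A}_j \\ c_j \end{Bmatrix} x_j = \begin{Bmatrix} \mathbf{B}^{-1}\mathbf{A}_j \\ -\boldsymbol{\pi}^T\mathbf{A}_j + c_j \end{Bmatrix} x_j = \begin{cases} x_j & \text{if } x_j \text{ is a basic variable} \\ \mathbf{D}^{-1}\mathbf{P}_j x_j & \text{if } x_j \text{ is not a basic variable} \end{cases}$$

From Eq. (4.8), the updated column $\bar{\mathbf{A}}_j$ can be identified as

$$\bar{\mathbf{A}}_j = \mathbf{B}^{-1}\mathbf{A}_j \tag{4.9}$$

and the modified cost coefficient $\bar{c}_j$ as

$$\bar{c}_j = c_j - \boldsymbol{\pi}^T\mathbf{A}_j \tag{4.10}$$

Equations (4.9) and (4.10) can be used to perform a simplex iteration by generating
$\bar{\mathbf{A}}_j$ and $\bar{c}_j$ from the original problem data, $\mathbf{A}_j$ and $c_j$.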
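As a rough illustration of Eqs. (4.7), (4.9), and (4.10), the following Python sketch computes the simplex multipliers, the relative costs, and one updated column directly from the original data and a given basis inverse. The helper names are hypothetical, exact arithmetic with `fractions.Fraction` is an arbitrary choice, and the numbers are borrowed from cycle 1 of Example 4.1 (basis $x_2$, $x_5$, $x_6$).

```python
from fractions import Fraction as Fr

def column(A, j):
    """Return column j of the matrix A (list of rows)."""
    return [row[j] for row in A]

def simplex_multipliers(c_B, B_inv):
    """Eq. (4.7): pi^T = c_B^T B^{-1} (row vector times matrix)."""
    m = len(c_B)
    return [sum(c_B[i] * B_inv[i][k] for i in range(m)) for k in range(m)]

def relative_cost(c, A, pi, j):
    """Eq. (4.10): c_bar_j = c_j - pi^T A_j."""
    return c[j] - sum(p * a for p, a in zip(pi, column(A, j)))

def updated_column(B_inv, A, j):
    """Eq. (4.9): A_bar_j = B^{-1} A_j."""
    A_j = column(A, j)
    return [sum(b * a for b, a in zip(row, A_j)) for row in B_inv]

# Original data of Example 4.1 (standard form with slacks x4, x5, x6).
A = [[2, 1, -1, 1, 0, 0],
     [2, -1, 5, 0, 1, 0],
     [4, 1, 1, 0, 0, 1]]
c = [-1, -2, -1, 0, 0, 0]
# Inverse of the basis [A_2 A_5 A_6] and the matching cost vector c_B.
B_inv = [[Fr(1), Fr(0), Fr(0)], [Fr(1), Fr(1), Fr(0)], [Fr(-1), Fr(0), Fr(1)]]
c_B = [Fr(-2), Fr(0), Fr(0)]

pi = simplex_multipliers(c_B, B_inv)
c_bar = [relative_cost(c, A, pi, j) for j in range(6)]
A_bar_3 = updated_column(B_inv, A, 2)  # column of x3 (zero-based index 2)
print(pi, c_bar, A_bar_3)
```

No full tableau is stored here: every quantity is regenerated from `A`, `c`, and the $m \times m$ basis inverse, which is the storage saving the text describes.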

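Once $\bar{\mathbf{A}}_j$ and $\bar{c}_j$ are computed, the pivot element $\bar{a}_{rs}$ can be identified by using
Eqs. (4.1) and (4.2). In the next step, $\mathbf{P}_s$ is introduced into the basis and $\mathbf{P}_{j_r}$ is removed.
This amounts to generating the inverse of the new basis matrix. The computational
procedure can be seen by considering the matrix:

$$\Bigl[\ \underbrace{\mathbf{P}_{j_1}\ \mathbf{P}_{j_2}\ \cdots\ \mathbf{P}_{j_m}\ \mathbf{P}_{n+1}}_{\mathbf{D},\ (m+1)\times(m+1)} \quad \underbrace{\mathbf{e}_1\ \mathbf{e}_2\ \cdots\ \mathbf{e}_{m+1}}_{\mathbf{I},\ (m+1)\times(m+1)} \quad \begin{Bmatrix} a_{1s} \\ a_{2s} \\ \vdots \\ a_{ms} \\ c_s \end{Bmatrix}\ \Bigr] \tag{4.11}$$

where $\mathbf{e}_i$ is an $(m + 1)$-dimensional unit vector with a one in the $i$th row. Premultiplication
of the above matrix by $\mathbf{D}^{-1}$ yields

$$\Bigl[\ \underbrace{\mathbf{e}_1\ \mathbf{e}_2\ \cdots\ \mathbf{e}_r\ \cdots\ \mathbf{e}_{m+1}}_{\mathbf{I},\ (m+1)\times(m+1)} \quad \underbrace{\mathbf{D}^{-1}}_{(m+1)\times(m+1)} \quad \underbrace{\begin{Bmatrix} \bar{a}_{1s} \\ \bar{a}_{2s} \\ \vdots \\ \bar{a}_{rs}\ (\text{pivot element}) \\ \vdots \\ \bar{a}_{ms} \\ \bar{c}_s \end{Bmatrix}}_{(m+1)\times 1}\ \Bigr] \tag{4.12}$$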
By carrying out a pivot operation on $\bar{a}_{rs}$, this matrix transforms to

$$\bigl[\,[\mathbf{e}_1\ \mathbf{e}_2\ \cdots\ \mathbf{e}_{r-1}\ \boldsymbol{\beta}\ \mathbf{e}_{r+1}\ \cdots\ \mathbf{e}_{m+1}]\quad [\mathbf{D}_{\text{new}}^{-1}]\quad \mathbf{e}_r\,\bigr] \tag{4.13}$$

where all the elements of the vector $\boldsymbol{\beta}$ are, in general, nonzero and the second partition
gives the desired matrix $\mathbf{D}_{\text{new}}^{-1}$.† It can be seen that the first partition (matrix $\mathbf{I}$) is included
only to illustrate the transformation, and it can be dropped in actual computations. Thus
in practice, we write the $(m + 1) \times (m + 2)$ matrix

$$\Bigl[\ \mathbf{D}^{-1} \quad \begin{Bmatrix} \bar{a}_{1s} \\ \bar{a}_{2s} \\ \vdots \\ \bar{a}_{rs} \\ \vdots \\ \bar{a}_{ms} \\ \bar{c}_s \end{Bmatrix}\ \Bigr]$$

and carry out a pivot operation on $\bar{a}_{rs}$. The first $m + 1$ columns of the resulting matrix
will give us the desired matrix $\mathbf{D}_{\text{new}}^{-1}$.

†This can be verified by comparing the matrix of Eq. (4.13) with the one given in Eq. (4.11). The columns
corresponding to the new basis matrix are given by

$$\mathbf{D}_{\text{new}} = [\mathbf{P}_{j_1}\ \mathbf{P}_{j_2}\ \cdots\ \mathbf{P}_{j_{r-1}}\ \mathbf{P}_s\ \mathbf{P}_{j_{r+1}}\ \cdots\ \mathbf{P}_{j_m}\ \mathbf{P}_{n+1}]$$

with $\mathbf{P}_s$ brought in place of $\mathbf{P}_{j_r}$. These columns are modified and can be seen to form a unit matrix in Eq. (4.13). The sequence of pivot
operations that did this must be equivalent to multiplying the original matrix, Eq. (4.11), by $\mathbf{D}_{\text{new}}^{-1}$. Thus the
second partition of the matrix in Eq. (4.13) gives the desired $\mathbf{D}_{\text{new}}^{-1}$.
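The pivot update of $\mathbf{D}^{-1}$ can be sketched as follows. This is a minimal illustration, not the book's code: `pivot_update` is a hypothetical name, rows are zero-based, and the test data are the two pivots of Example 4.1 (entering columns taken from the last columns of Tables 4.7 and 4.9).

```python
from fractions import Fraction as Fr

def pivot_update(D_inv, P_s_bar, r):
    """Pivot the matrix [D^{-1} | P_s_bar] on element r of the last column.

    D_inv   : current (m+1) x (m+1) inverse, as a list of rows
    P_s_bar : updated entering column (a_bar_1s, ..., a_bar_ms, c_bar_s)
    r       : zero-based pivot row
    Returns the first m+1 columns after the pivot, i.e. the new D^{-1}.
    """
    pivot = Fr(P_s_bar[r])
    new_r = [Fr(x) / pivot for x in D_inv[r]]
    result = []
    for i, row in enumerate(D_inv):
        if i == r:
            result.append(new_r)
        else:
            # Eliminate the entering column from every other row.
            result.append([Fr(x) - P_s_bar[i] * y for x, y in zip(row, new_r)])
    return result

identity = [[Fr(int(i == j)) for j in range(4)] for i in range(4)]
# Cycle 0 -> 1 of Example 4.1: x2 enters, pivot row 0 (x4 leaves).
D1 = pivot_update(identity, [1, -1, 1, -2], 0)
# Cycle 1 -> 2: x3 enters, pivot row 1 (x5 leaves).
D2 = pivot_update(D1, [-1, 4, 2, -3], 1)
print(D2[3])  # last row holds (-pi_1, -pi_2, -pi_3, 1)
```

Only $(m+1)^2$ numbers are transformed per iteration, regardless of how many columns $n$ the original problem has.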


Procedure. The detailed iterative procedure of the revised simplex method to solve 
a general linear programming problem is given by the following steps. 

1. Write the given system of equations in canonical form, by adding the artificial
variables $x_{n+1}, x_{n+2}, \ldots, x_{n+m}$, and the infeasibility form for phase I as shown
below:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + x_{n+1} &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + x_{n+2} &= b_2 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n + x_{n+m} &= b_m \\
c_1x_1 + c_2x_2 + \cdots + c_nx_n - f &= 0 \\
d_1x_1 + d_2x_2 + \cdots + d_nx_n - w &= -w_0
\end{aligned} \tag{4.14}$$

Here the constants $b_i$, $i = 1$ to $m$, are made nonnegative by changing, if necessary,
the signs of all terms in the original equations before the addition of
the artificial variables $x_{n+i}$, $i = 1$ to $m$. Since the original infeasibility form is
given by

$$w = x_{n+1} + x_{n+2} + \cdots + x_{n+m} \tag{4.15}$$

the artificial variables can be eliminated from Eq. (4.15) by adding the first $m$
equations of Eqs. (4.14) and subtracting the result from Eq. (4.15). The resulting
equation is shown as the last equation in Eqs. (4.14) with

$$d_j = -\sum_{i=1}^{m} a_{ij} \quad \text{and} \quad w_0 = \sum_{i=1}^{m} b_i \tag{4.16}$$

Equations (4.14) are written in tableau form as shown in Table 4.1.

2. The iterative procedure (cycle 0) is started with $x_{n+1}, x_{n+2}, \ldots, x_{n+m}$, $-f$, and
$-w$ as the basic variables. A tableau is opened by entering the coefficients of
the basic variables and the constant terms as shown in Table 4.2. The starting
basis matrix is, from Table 4.1, $\mathbf{B} = \mathbf{I}$, and its inverse $\mathbf{B}^{-1} = [\beta_{ij}]$ can also be
seen to be an identity matrix in Table 4.2. The rows corresponding to $-f$ and
$-w$ in Table 4.2 give the negative of simplex multipliers $\pi_i$ and $\sigma_i$ ($i = 1$ to $m$),
respectively. These are also zero since $\mathbf{c}_B = \mathbf{d}_B = \mathbf{0}$ and hence

$$\boldsymbol{\pi}^T = \mathbf{c}_B^T\mathbf{B}^{-1} = \mathbf{0}, \qquad \boldsymbol{\sigma}^T = \mathbf{d}_B^T\mathbf{B}^{-1} = \mathbf{0}$$

In general, at the start of some cycle $k$ ($k = 0$ to start with) we open a tableau
similar to Table 4.2, as shown in Table 4.4. This can also be interpreted as
composed of the inverse of the current basis, $\mathbf{B}^{-1} = [\beta_{ij}]$, two rows for the
simplex multipliers $\pi_i$ and $\sigma_i$, a column for the values of the basic variables in
the basic solution, and a column for the variable $x_s$. At the start of any cycle,
all entries in the tableau, except the last column, are known.

Table 4.1 (tableau form of Eqs. (4.14); the table body is not legible in the source)

Table 4.2 Tableau at the Beginning of Cycle 0

                     Columns of the canonical form
Basic variables   $x_{n+1}$   $x_{n+2}$   ...   $x_{n+r}$   ...   $x_{n+m}$   $-f$   $-w$   Value of the basic variable   $x_s^a$
$x_{n+1}$             1                                                              $b_1$
$x_{n+2}$                         1                                                  $b_2$
...
$x_{n+r}$                                       1                                    $b_r$
...
$x_{n+m}$                                                   1                        $b_m$
                  (the $m \times m$ block above is the inverse of the basis)
$-f$                  0           0             0           0          1             0
$-w$                  0           0             0           0                 1      $-w_0 = -\sum_{i=1}^{m} b_i$

$^a$This column is blank at the beginning of cycle 0 and filled up only at the end of cycle 0.

3. The values of the relative cost factors $\bar{d}_j$ (for phase I) or $\bar{c}_j$ (for phase II) are
computed as

$$\bar{d}_j = d_j - \boldsymbol{\sigma}^T\mathbf{A}_j$$

$$\bar{c}_j = c_j - \boldsymbol{\pi}^T\mathbf{A}_j$$

and entered in a tableau form as shown in Table 4.3. For cycle 0, $\boldsymbol{\sigma}^T = \mathbf{0}$ and
hence $\bar{d}_j = d_j$.

4. If the current cycle corresponds to phase I, find whether all $\bar{d}_j \geq 0$. If all
$\bar{d}_j \geq 0$ and $\bar{w}_0 > 0$, there is no feasible solution to the linear programming
problem, so the process is terminated. If all $\bar{d}_j \geq 0$ and $\bar{w}_0 = 0$, the current basic
solution is a basic feasible solution to the linear programming problem and hence
phase II is started by (a) dropping all variables $x_j$ with $\bar{d}_j > 0$, (b) dropping
the $w$ row of the tableau, and (c) restarting the cycle (step 3) using phase
II rules.
Table 4.3 Relative Cost Factor $\bar{d}_j$ or $\bar{c}_j$

                                 Variable $x_j$
Cycle number        $x_1$   $x_2$   ...   $x_n$   $x_{n+1}$   $x_{n+2}$   ...   $x_{n+m}$
Phase I     0       $d_1$   $d_2$   ...   $d_n$   0           0           ...   0
            1
            ...
            $l$
Phase II    $l+1$
            $l+2$
            ...

For every cycle after cycle 0: use the values of $\sigma_i$ (if phase I) or $\pi_i$ (if phase II) of the
current cycle and compute

$$\bar{d}_j = d_j - (\sigma_1 a_{1j} + \sigma_2 a_{2j} + \cdots + \sigma_m a_{mj})$$

or

$$\bar{c}_j = c_j - (\pi_1 a_{1j} + \pi_2 a_{2j} + \cdots + \pi_m a_{mj})$$

Enter $\bar{d}_j$ or $\bar{c}_j$ in the row corresponding to the current cycle
and choose the pivot column $s$ such that $\bar{d}_s = \min \bar{d}_j$
(if phase I) or $\bar{c}_s = \min \bar{c}_j$ (if phase II).


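Table 4.4 Tableau at the Beginning of Cycle k

                  Columns of the original canonical form
Basic variable    $x_{n+1}$   ...   $x_{n+m}$      $-f$   $-w$   Value of the basic variable   $x_s^a$
                  $[\beta_{ij}] = [\bar{a}_{i,n+j}]$ (inverse of the basis)
$x_{j_1}$         $\beta_{11}$   ...   $\beta_{1m}$                 $\bar{b}_1$     $\bar{a}_{1s} = \sum_{i=1}^{m} \beta_{1i} a_{is}$
...
$x_{j_r}$         $\beta_{r1}$   ...   $\beta_{rm}$                 $\bar{b}_r$     $\bar{a}_{rs} = \sum_{i=1}^{m} \beta_{ri} a_{is}$
...
$x_{j_m}$         $\beta_{m1}$   ...   $\beta_{mm}$                 $\bar{b}_m$     $\bar{a}_{ms} = \sum_{i=1}^{m} \beta_{mi} a_{is}$
$-f$              $-\pi_1$   ...   $-\pi_m$         1             $-\bar{f}_0$    $\bar{c}_s = c_s - \sum_{i=1}^{m} \pi_i a_{is}$
                  ($-\pi_j = +\bar{c}_{n+j}$)
$-w$              $-\sigma_1$   ...   $-\sigma_m$          1      $-\bar{w}_0$    $\bar{d}_s = d_s - \sum_{i=1}^{m} \sigma_i a_{is}$
                  ($-\sigma_j = +\bar{d}_{n+j}$)

$^a$This column is blank at the start of cycle $k$ and is filled up only at the end of cycle $k$.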


If some $\bar{d}_j < 0$, choose $x_s$ as the variable to enter the basis in the next
cycle in place of the present $r$th basic variable ($r$ will be determined later) such
that

$$\bar{d}_s = \min(\bar{d}_j < 0)$$

On the other hand, if the current cycle corresponds to phase II, find whether
all $\bar{c}_j \geq 0$. If all $\bar{c}_j \geq 0$, the current basic feasible solution is also an optimal
solution and hence terminate the process. If some $\bar{c}_j < 0$, choose $x_s$ to enter
the basic set in the next cycle in place of the $r$th basic variable ($r$ to be found
later), such that

$$\bar{c}_s = \min(\bar{c}_j < 0)$$

5. Compute the elements of the $x_s$ column from Eq. (4.9) as

$$\bar{\mathbf{A}}_s = \mathbf{B}^{-1}\mathbf{A}_s = [\beta_{ij}]\mathbf{A}_s$$

that is,

$$\begin{aligned}
\bar{a}_{1s} &= \beta_{11}a_{1s} + \beta_{12}a_{2s} + \cdots + \beta_{1m}a_{ms} \\
\bar{a}_{2s} &= \beta_{21}a_{1s} + \beta_{22}a_{2s} + \cdots + \beta_{2m}a_{ms} \\
&\ \ \vdots \\
\bar{a}_{ms} &= \beta_{m1}a_{1s} + \beta_{m2}a_{2s} + \cdots + \beta_{mm}a_{ms}
\end{aligned}$$

and enter in the last column of Table 4.2 (if cycle 0) or Table 4.4 (if cycle $k$).

6. Inspect the signs of all entries $\bar{a}_{is}$, $i = 1$ to $m$. If all $\bar{a}_{is} \leq 0$, the class of
solutions

$x_s \geq 0$ arbitrary,

$x_{j_i} = \bar{b}_i - \bar{a}_{is} x_s$ if $x_{j_i}$ is a basic variable, and $x_j = 0$ if $x_j$ is a nonbasic
variable ($j \neq s$), satisfies the original system and has the property

$$f = \bar{f}_0 + \bar{c}_s x_s \to -\infty \quad \text{as } x_s \to +\infty$$

Hence terminate the process. On the other hand, if some $\bar{a}_{is} > 0$, select the
variable $x_r$ that can be dropped in the next cycle as

$$\frac{\bar{b}_r}{\bar{a}_{rs}} = \min_{\bar{a}_{is} > 0}\left(\frac{\bar{b}_i}{\bar{a}_{is}}\right)$$

In the case of a tie, choose $r$ at random.

7. To bring $x_s$ into the basis in place of $x_r$, carry out a pivot operation on the
element $\bar{a}_{rs}$ in Table 4.4 and enter the result as shown in Table 4.5. As usual,
the last column of Table 4.5 will be left blank at the beginning of the current
cycle $k + 1$. Also, retain the list of basic variables in the first column of Table 4.5
the same as in Table 4.4, except that $j_r$ is changed to the value of $s$ determined
in step 4.

8. Go to step 3 to initiate the next cycle, $k + 1$.
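Steps 3 to 8 can be put together in a compact Python sketch. It covers phase II only and assumes, as a simplification not required by the procedure itself, that the starting basis columns form an identity matrix with zero cost coefficients (for example, slack variables), so that $\mathbf{D}^{-1} = \mathbf{I}$ at cycle 0. The function name `revised_simplex_phase2` is hypothetical, indices are zero-based, and ties are broken by taking the first candidate rather than at random. The sample data are those of Example 4.1 in standard form.

```python
from fractions import Fraction as Fr

def revised_simplex_phase2(A, b, c, basis):
    """Minimize c.x subject to A x = b, x >= 0 (phase II only).

    Assumes the columns listed in `basis` form an identity matrix and have
    zero cost, so the initial D^{-1} is the identity.
    """
    m, n = len(A), len(c)
    D_inv = [[Fr(int(i == j)) for j in range(m + 1)] for i in range(m + 1)]
    rhs = [Fr(v) for v in b] + [Fr(0)]      # basic values, then -f0
    basis = list(basis)

    def P(j):
        """Augmented original column P_j = (A_j, c_j)."""
        return [Fr(A[i][j]) for i in range(m)] + [Fr(c[j])]

    while True:
        # Step 3: last row of D^{-1} is (-pi^T, 1), so row . P_j = c_bar_j.
        c_bar = [sum(d * p for d, p in zip(D_inv[m], P(j))) for j in range(n)]
        s = min(range(n), key=lambda j: c_bar[j])
        if c_bar[s] >= 0:                   # Step 4: optimality test
            break
        # Step 5: updated column (a_bar_1s, ..., a_bar_ms, c_bar_s).
        col = [sum(d * p for d, p in zip(row, P(s))) for row in D_inv]
        # Step 6: ratio test over rows with a positive entry.
        rows = [i for i in range(m) if col[i] > 0]
        if not rows:
            raise ValueError("objective is unbounded below")
        r = min(rows, key=lambda i: rhs[i] / col[i])
        # Step 7: pivot on a_bar_rs, updating D^{-1} and the value column.
        pivot = col[r]
        D_inv[r] = [v / pivot for v in D_inv[r]]
        rhs[r] = rhs[r] / pivot
        for i in range(m + 1):
            if i != r:
                D_inv[i] = [v - col[i] * w for v, w in zip(D_inv[i], D_inv[r])]
                rhs[i] -= col[i] * rhs[r]
        basis[r] = s                        # Step 8: next cycle

    x = [Fr(0)] * n
    for i, j in enumerate(basis):
        x[j] = rhs[i]
    return x, -rhs[m]

A = [[2, 1, -1, 1, 0, 0], [2, -1, 5, 0, 1, 0], [4, 1, 1, 0, 0, 1]]
x_opt, f_min = revised_simplex_phase2(A, [2, 6, 6], [-1, -2, -1, 0, 0, 0], [3, 4, 5])
print(x_opt, f_min)
```

A production code would add phase I, an anti-cycling rule, and periodic reinversion of the basis to control round-off; none of these is attempted in this sketch.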

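Table 4.5 Tableau at the Beginning of Cycle k + 1

                  Columns of the canonical form
Basic variables   $x_{n+1}$   ...   $x_{n+m}$      $-f$   $-w$   Value of the basic variable   $x_s^a$
$x_{j_1}$         $\beta_{11} - \bar{a}_{1s}\beta^*_{r1}$   ...   $\beta_{1m} - \bar{a}_{1s}\beta^*_{rm}$                 $\bar{b}_1 - \bar{a}_{1s}\bar{b}^*_r$
...
$x_s$             $\beta^*_{r1}$   ...   $\beta^*_{rm}$                                                                  $\bar{b}^*_r$
...
$x_{j_m}$         $\beta_{m1} - \bar{a}_{ms}\beta^*_{r1}$   ...   $\beta_{mm} - \bar{a}_{ms}\beta^*_{rm}$                 $\bar{b}_m - \bar{a}_{ms}\bar{b}^*_r$
$-f$              $-\pi_1 - \bar{c}_s\beta^*_{r1}$   ...   $-\pi_m - \bar{c}_s\beta^*_{rm}$       1                     $-\bar{f}_0 - \bar{c}_s\bar{b}^*_r$
$-w$              $-\sigma_1 - \bar{d}_s\beta^*_{r1}$   ...   $-\sigma_m - \bar{d}_s\beta^*_{rm}$        1              $-\bar{w}_0 - \bar{d}_s\bar{b}^*_r$

$$\beta^*_{ri} = \frac{\beta_{ri}}{\bar{a}_{rs}}\ (i = 1 \text{ to } m) \quad \text{and} \quad \bar{b}^*_r = \frac{\bar{b}_r}{\bar{a}_{rs}}$$

$^a$This column is blank at the start of the cycle.

Example 4.1

Maximize $F = x_1 + 2x_2 + x_3$

subject to

$$\begin{aligned}
2x_1 + x_2 - x_3 &\leq 2 \\
-2x_1 + x_2 - 5x_3 &\geq -6 \\
4x_1 + x_2 + x_3 &\leq 6 \\
x_1 \geq 0,\ x_2 \geq 0,\ x_3 &\geq 0
\end{aligned}$$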


SOLUTION This problem can be stated in standard form as (making all the constants
$b_i$ positive and then adding the slack variables):

Minimize

$$f = -x_1 - 2x_2 - x_3$$

subject to

$$\begin{aligned}
2x_1 + x_2 - x_3 + x_4 &= 2 \\
2x_1 - x_2 + 5x_3 + x_5 &= 6 \\
4x_1 + x_2 + x_3 + x_6 &= 6
\end{aligned} \tag{E$_1$}$$

$$x_i \geq 0, \quad i = 1 \text{ to } 6$$

where $x_4$, $x_5$, and $x_6$ are slack variables. Since the set of equations (E$_1$) are in canonical
form with respect to $x_4$, $x_5$, and $x_6$, $x_i = 0$ ($i = 1, 2, 3$) and $x_4 = 2$, $x_5 = 6$, and $x_6 = 6$
can be taken as an initial basic feasible solution and hence there is no need for phase I.

Step 1 All the equations (including the objective function) can be written in canonical
form as

$$\begin{aligned}
2x_1 + x_2 - x_3 + x_4 &= 2 \\
2x_1 - x_2 + 5x_3 + x_5 &= 6 \\
4x_1 + x_2 + x_3 + x_6 &= 6 \\
-x_1 - 2x_2 - x_3 - f &= 0
\end{aligned} \tag{E$_2$}$$

These equations are written in tableau form in Table 4.6.
Table 4.6 Detached Coefficients of the Original System

        Admissible variables
$x_1$   $x_2$   $x_3$   $x_4$   $x_5$   $x_6$   $-f$   Constants
  2       1      -1       1       0       0              2
  2      -1       5       0       1       0              6
  4       1       1       0       0       1              6
 -1      -2      -1       0       0       0      1       0

Table 4.7 Tableau at the Beginning of Cycle 0

                  Columns of the canonical form     Value of the basic
Basic variables   $x_4$   $x_5$   $x_6$   $-f$       variable (constant)    $x_2^a$
$x_4$               1       0       0       0        2                      $\bar{a}_{42} = 1$ (pivot element)
$x_5$               0       1       0       0        6                      $\bar{a}_{52} = -1$
$x_6$               0       0       1       0        6                      $\bar{a}_{62} = 1$
                  (inverse of the basis $= [\beta_{ij}]$)
$-f$                0       0       0       1        0                      $\bar{c}_2 = -2$

$^a$This column is entered at the end of step 5.


Step 2 The iterative procedure (cycle 0) starts with $x_4$, $x_5$, $x_6$, and $-f$ as basic variables.
A tableau is opened by entering the coefficients of the basic variables
and the constant terms as shown in Table 4.7. Since the basis matrix is $\mathbf{B} = \mathbf{I}$,
its inverse $\mathbf{B}^{-1} = [\beta_{ij}] = \mathbf{I}$. The row corresponding to $-f$ in Table 4.7
gives the negative of simplex multipliers $\pi_i$, $i = 1, 2, 3$. These are all zero
in cycle 0. The entries of the last column of the table are, of course, not yet
known.

Step 3 The relative cost factors $\bar{c}_j$ are computed as

$$\bar{c}_j = c_j - \boldsymbol{\pi}^T\mathbf{A}_j = c_j, \quad j = 1 \text{ to } 6$$

since all $\pi_i$ are zero. Thus

$$\begin{aligned}
\bar{c}_1 &= c_1 = -1 \\
\bar{c}_2 &= c_2 = -2 \\
\bar{c}_3 &= c_3 = -1 \\
\bar{c}_4 &= c_4 = 0 \\
\bar{c}_5 &= c_5 = 0 \\
\bar{c}_6 &= c_6 = 0
\end{aligned}$$

These cost coefficients are entered as the first row of a tableau (Table 4.8).
Table 4.8 Relative Cost Factors $\bar{c}_j$

                          Variable $x_j$
Cycle number    $x_1$   $x_2$   $x_3$   $x_4$    $x_5$   $x_6$
Phase II
Cycle 0          -1     [-2]     -1      0        0       0
Cycle 1           3       0     [-3]     2        0       0
Cycle 2           6       0       0     11/4     3/4      0

(Bracketed entries mark the most negative relative cost in each cycle, that is, the pivot column.)


Step 4 Find whether all $\bar{c}_j \geq 0$ for optimality. The present basic feasible solution is
not optimal since some $\bar{c}_j$ are negative. Hence select a variable $x_s$ to enter
the basic set in the next cycle such that $\bar{c}_s = \min(\bar{c}_j < 0) = \bar{c}_2$ in this case.
Therefore, $x_2$ enters the basic set.

Step 5 Compute the elements of the $x_s$ column as

$$\bar{\mathbf{A}}_s = [\beta_{ij}]\mathbf{A}_s$$

where $[\beta_{ij}]$ is available in Table 4.7 and $\mathbf{A}_s$ in Table 4.6.

$$\bar{\mathbf{A}}_2 = \mathbf{I}\mathbf{A}_2 = \begin{Bmatrix} 1 \\ -1 \\ 1 \end{Bmatrix}$$

These elements, along with the value of $\bar{c}_2$, are entered in the last column of
Table 4.7.

Step 6 Select a variable ($x_r$) to be dropped from the current basic set as

$$\frac{\bar{b}_r}{\bar{a}_{rs}} = \min_{\bar{a}_{is} > 0}\left(\frac{\bar{b}_i}{\bar{a}_{is}}\right)$$

In this case,

$$\frac{\bar{b}_4}{\bar{a}_{42}} = \frac{2}{1} = 2, \qquad \frac{\bar{b}_6}{\bar{a}_{62}} = \frac{6}{1} = 6$$

Therefore, $x_r = x_4$.

Step 7 To bring $x_2$ into the basic set in place of $x_4$, pivot on $\bar{a}_{rs} = \bar{a}_{42}$ in Table 4.7.
Enter the result as shown in Table 4.9, keeping its last column blank. Since a
new cycle has to be started, we go to step 3.

Step 3 The relative cost factors are calculated as

$$\bar{c}_j = c_j - (\pi_1 a_{1j} + \pi_2 a_{2j} + \pi_3 a_{3j})$$
Table 4.9 Tableau at the Beginning of Cycle 1

                  Columns of the original canonical form             Value of the basic
Basic variables   $x_4$          $x_5$          $x_6$          $-f$   variable              $x_3^a$
$x_2$               1              0              0              0    2                     $\bar{a}_{23} = -1$
$x_5$               1              1              0              0    8                     $\bar{a}_{53} = 4$ (pivot element)
$x_6$              -1              0              1              0    4                     $\bar{a}_{63} = 2$
                  (inverse of the basis $= [\beta_{ij}]$)
$-f$              $2 = -\pi_1$   $0 = -\pi_2$   $0 = -\pi_3$     1    4                     $\bar{c}_3 = -3$

$^a$This column is entered at the end of step 5.


where the negative values of $\pi_1$, $\pi_2$, and $\pi_3$ are given by the row of $-f$ in
Table 4.9, and $a_{ij}$ and $c_j$ are given in Table 4.6. Here $\pi_1 = -2$, $\pi_2 = 0$, and
$\pi_3 = 0$.

$$\begin{aligned}
\bar{c}_1 &= c_1 - \pi_1 a_{11} = -1 - (-2)(2) = 3 \\
\bar{c}_2 &= c_2 - \pi_1 a_{12} = -2 - (-2)(1) = 0 \\
\bar{c}_3 &= c_3 - \pi_1 a_{13} = -1 - (-2)(-1) = -3 \\
\bar{c}_4 &= c_4 - \pi_1 a_{14} = 0 - (-2)(1) = 2 \\
\bar{c}_5 &= c_5 - \pi_1 a_{15} = 0 - (-2)(0) = 0 \\
\bar{c}_6 &= c_6 - \pi_1 a_{16} = 0 - (-2)(0) = 0
\end{aligned}$$

Enter these values in the second row of Table 4.8.

Step 4 Since not all $\bar{c}_j$ are $\geq 0$, the current solution is not optimum. Hence
select a variable ($x_s$) to enter the basic set in the next cycle such that
$\bar{c}_s = \min(\bar{c}_j < 0) = \bar{c}_3$ in this case. Therefore, $x_s = x_3$.

Step 5 Compute the elements of the $x_s$ column as

$$\bar{\mathbf{A}}_s = [\beta_{ij}]\mathbf{A}_s$$

where $[\beta_{ij}]$ is available in Table 4.9 and $\mathbf{A}_s$ in Table 4.6:

$$\begin{Bmatrix} \bar{a}_{23} \\ \bar{a}_{53} \\ \bar{a}_{63} \end{Bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix} \begin{Bmatrix} -1 \\ 5 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 4 \\ 2 \end{Bmatrix}$$

Enter these elements and the value of $\bar{c}_s = \bar{c}_3 = -3$ in the last column of
Table 4.9.

Step 6 Find the variable ($x_r$) to be dropped from the basic set in the next cycle as

$$\frac{\bar{b}_r}{\bar{a}_{rs}} = \min_{\bar{a}_{is} > 0}\left(\frac{\bar{b}_i}{\bar{a}_{is}}\right)$$

Here

$$\frac{\bar{b}_5}{\bar{a}_{53}} = \frac{8}{4} = 2, \qquad \frac{\bar{b}_6}{\bar{a}_{63}} = \frac{4}{2} = 2$$

Since there is a tie between $x_5$ and $x_6$, we select $x_r = x_5$ arbitrarily.

Step 7 To bring $x_3$ into the basic set in place of $x_5$, pivot on $\bar{a}_{rs} = \bar{a}_{53}$ in Table 4.9.
Enter the result as shown in Table 4.10, keeping its last column blank. Since a
new cycle has to be started, we go to step 3.

Table 4.10 Tableau at the Beginning of Cycle 2

                  Columns of the original canonical form     Value of the basic
Basic variables   $x_4$     $x_5$     $x_6$     $-f$          variable              $x_s^a$
$x_2$              5/4       1/4        0         0           4
$x_3$              1/4       1/4        0         0           2
$x_6$             -6/4      -2/4        1         0           0
$-f$              11/4       3/4        0         1           10

$^a$This column is blank at the beginning of cycle 2.

Step 3 The simplex multipliers are given by the negative values of the numbers appearing
in the row of $-f$ in Table 4.10. Therefore, $\pi_1 = -\frac{11}{4}$, $\pi_2 = -\frac{3}{4}$, and $\pi_3 = 0$.
The relative cost factors are given by

$$\bar{c}_j = c_j - \boldsymbol{\pi}^T\mathbf{A}_j$$

Then

$$\begin{aligned}
\bar{c}_1 &= c_1 - \pi_1 a_{11} - \pi_2 a_{21} = -1 - (-\tfrac{11}{4})(2) - (-\tfrac{3}{4})(2) = 6 \\
\bar{c}_2 &= c_2 - \pi_1 a_{12} - \pi_2 a_{22} = -2 - (-\tfrac{11}{4})(1) - (-\tfrac{3}{4})(-1) = 0 \\
\bar{c}_3 &= c_3 - \pi_1 a_{13} - \pi_2 a_{23} = -1 - (-\tfrac{11}{4})(-1) - (-\tfrac{3}{4})(5) = 0 \\
\bar{c}_4 &= c_4 - \pi_1 a_{14} - \pi_2 a_{24} = 0 - (-\tfrac{11}{4})(1) - (-\tfrac{3}{4})(0) = \tfrac{11}{4} \\
\bar{c}_5 &= c_5 - \pi_1 a_{15} - \pi_2 a_{25} = 0 - (-\tfrac{11}{4})(0) - (-\tfrac{3}{4})(1) = \tfrac{3}{4} \\
\bar{c}_6 &= c_6 - \pi_1 a_{16} - \pi_2 a_{26} = 0 - (-\tfrac{11}{4})(0) - (-\tfrac{3}{4})(0) = 0
\end{aligned}$$

These values are entered as the third row in Table 4.8.

Step 4 Since all $\bar{c}_j$ are $\geq 0$, the present solution is optimum. Hence the optimum
solution is given by

$x_2 = 4$, $x_3 = 2$, $x_6 = 0$ (basic variables)

$x_1 = x_4 = x_5 = 0$ (nonbasic variables)

$f_{\min} = -10$
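The reported optimum can be checked against the original statement of Example 4.1 with a few lines of Python. This is only a sanity check written for illustration; the variable names are hypothetical.

```python
from fractions import Fraction

# Optimum reported in step 4 of Example 4.1.
x1, x2, x3 = Fraction(0), Fraction(4), Fraction(2)

# Original inequality constraints of the maximization problem.
constraints_ok = (
    2 * x1 + x2 - x3 <= 2
    and -2 * x1 + x2 - 5 * x3 >= -6
    and 4 * x1 + x2 + x3 <= 6
    and min(x1, x2, x3) >= 0
)

# Slack variables of the standard form (E1) and the two objective values.
slacks = (2 - (2 * x1 + x2 - x3),
          6 - (2 * x1 - x2 + 5 * x3),
          6 - (4 * x1 + x2 + x3))
F_max = x1 + 2 * x2 + x3   # original objective, to be maximized
f_min = -F_max             # objective of the equivalent minimization problem
print(constraints_ok, slacks, F_max, f_min)
```

All three slacks come out as zero, so the basic variable $x_6 = 0$ makes this optimum a degenerate basic feasible solution, which is consistent with the tie encountered in step 6.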
4.3 DUALITY IN LINEAR PROGRAMMING

Associated with every linear programming problem, called the primal, there is another 
linear programming problem called its dual. These two problems possess very inter- 
esting and closely related properties. If the optimal solution to any one is known, the 
optimal solution to the other can readily be obtained. In fact, it is immaterial which 
problem is designated the primal since the dual of a dual is the primal. Because of 
these properties, the solution of a linear programming problem can be obtained by 
solving either the primal or the dual, whichever is easier. This section deals with 
the primal-dual relations and their application in solving a given linear programming 
problem. 


4.3.1 Symmetric Primal-Dual Relations

A nearly symmetric relation between a primal problem and its dual problem can be
seen by considering the following system of linear inequalities (rather than equations).

Primal Problem.

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &\geq b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &\geq b_2 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &\geq b_m \\
c_1x_1 + c_2x_2 + \cdots + c_nx_n &= f
\end{aligned} \tag{4.17}$$

($x_i \geq 0$, $i = 1$ to $n$, and $f$ is to be minimized)

Dual Problem. As a definition, the dual problem can be formulated by transposing
the rows and columns of Eq. (4.17) including the right-hand side and the objective
function, reversing the inequalities, and maximizing instead of minimizing. Thus by
denoting the dual variables as $y_1, y_2, \ldots, y_m$, the dual problem becomes

$$\begin{aligned}
a_{11}y_1 + a_{21}y_2 + \cdots + a_{m1}y_m &\leq c_1 \\
a_{12}y_1 + a_{22}y_2 + \cdots + a_{m2}y_m &\leq c_2 \\
&\ \ \vdots \\
a_{1n}y_1 + a_{2n}y_2 + \cdots + a_{mn}y_m &\leq c_n \\
b_1y_1 + b_2y_2 + \cdots + b_my_m &= v
\end{aligned} \tag{4.18}$$

($y_i \geq 0$, $i = 1$ to $m$, and $v$ is to be maximized)

Equations (4.17) and (4.18) are called symmetric primal-dual pairs and it is easy to
see from these relations that the dual of the dual is the primal.
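The transposition rule of Eqs. (4.17) and (4.18) is easy to express in code. The sketch below is illustrative only: `symmetric_dual` and `dual_of_dual` are hypothetical helpers, a problem is represented as a triple (matrix, right-hand side, objective coefficients), and the small data set is made up for the demonstration. To dualize the dual, it is first rewritten in the minimization form of Eq. (4.17) by negating all its data, and the result is negated back at the end.

```python
def symmetric_dual(A, b, c):
    """Primal: min c.x, A x >= b, x >= 0  ->  dual: max b.y, A^T y <= c, y >= 0.

    Returns (constraint matrix, right-hand side, objective coefficients) of the dual.
    """
    A_T = [list(col) for col in zip(*A)]
    return A_T, list(c), list(b)

def negated(v):
    return [-x for x in v]

def dual_of_dual(A, b, c):
    """Dualize twice and return the result as a minimization problem."""
    A_d, rhs_d, obj_d = symmetric_dual(A, b, c)
    # max obj_d.y, A_d y <= rhs_d   is the same as   min (-obj_d).y, (-A_d) y >= -rhs_d
    A_min = [negated(row) for row in A_d]
    A_dd, rhs_dd, obj_dd = symmetric_dual(A_min, negated(rhs_d), negated(obj_d))
    # The result is again a max/<= problem; negate once more to get min/>= form.
    return [negated(row) for row in A_dd], negated(rhs_dd), negated(obj_dd)

A = [[1, 2], [3, 4], [5, 6]]   # made-up 3 x 2 example
b = [7, 8, 9]
c = [10, 11]
dual = symmetric_dual(A, b, c)
round_trip = dual_of_dual(A, b, c)
print(dual, round_trip)
```

The round trip returns the original triple, which is the statement that the dual of the dual is the primal.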
4.3.2 General Primal-Dual Relations

Although the primal-dual relations of Section 4.3.1 are derived by considering a system
of inequalities in nonnegative variables, it is always possible to obtain the primal-dual
relations for a general system consisting of a mixture of equations, less-than or
greater-than type inequalities, and nonnegative variables or variables unrestricted in sign, by
reducing the system to an equivalent inequality system of the form of Eqs. (4.17). The correspondence
rules that are to be applied in deriving the general primal-dual relations are
given in Table 4.11 and the primal-dual relations are shown in Table 4.12.

4.3.3 Primal-Dual Relations When the Primal Is in Standard Form

If $m^* = m$ and $n^* = n$, the primal problem shown in Table 4.12 reduces to the standard
form and the general primal-dual relations take the special form shown in Table 4.13.
It is to be noted that the symmetric primal-dual relations, discussed in Section 4.3.1,
can also be obtained as a special case of the general relations by setting $m^* = 0$ and
$n^* = n$ in the relations of Table 4.12.


Table 4.11 Correspondence Rules for Primal-Dual Relations

Primal quantity                                                   Corresponding dual quantity
Objective function: Minimize $\mathbf{c}^T\mathbf{X}$             Maximize $\mathbf{Y}^T\mathbf{b}$
Variable $x_i \geq 0$                                             $i$th constraint $\mathbf{Y}^T\mathbf{A}_i \leq c_i$ (inequality)
Variable $x_i$ unrestricted in sign                               $i$th constraint $\mathbf{Y}^T\mathbf{A}_i = c_i$ (equality)
$j$th constraint, $\mathbf{A}_j\mathbf{X} = b_j$ (equality)       $j$th variable $y_j$ unrestricted in sign
$j$th constraint, $\mathbf{A}_j\mathbf{X} \geq b_j$ (inequality)  $j$th variable $y_j \geq 0$
Coefficient matrix $\mathbf{A} \equiv [\mathbf{A}_1 \ldots \mathbf{A}_m]$      Coefficient matrix $\mathbf{A}^T \equiv [\mathbf{A}_1, \ldots, \mathbf{A}_m]^T$
Right-hand-side vector $\mathbf{b}$                               Right-hand-side vector $\mathbf{c}$
Cost coefficients $\mathbf{c}$                                    Cost coefficients $\mathbf{b}$

Table 4.12 Primal-Dual Relations

Primal problem:

Minimize $f = \sum_{i=1}^{n} c_i x_i$ subject to

$$\sum_{j=1}^{n} a_{ij}x_j = b_i, \quad i = 1, 2, \ldots, m^*$$

$$\sum_{j=1}^{n} a_{ij}x_j \geq b_i, \quad i = m^* + 1, m^* + 2, \ldots, m$$

where $x_i \geq 0$, $i = 1, 2, \ldots, n^*$; and $x_i$ unrestricted in sign, $i = n^* + 1, n^* + 2, \ldots, n$

Corresponding dual problem:

Maximize $v = \sum_{i=1}^{m} y_i b_i$ subject to

$$\sum_{i=1}^{m} y_i a_{ij} = c_j, \quad j = n^* + 1, n^* + 2, \ldots, n$$

$$\sum_{i=1}^{m} y_i a_{ij} \leq c_j, \quad j = 1, 2, \ldots, n^*$$

where $y_i \geq 0$, $i = m^* + 1, m^* + 2, \ldots, m$; and $y_i$ unrestricted in sign, $i = 1, 2, \ldots, m^*$
Table 4.13 Primal-Dual Relations Where $m^* = m$ and $n^* = n$

Primal problem:

Minimize $f = \sum_{i=1}^{n} c_i x_i$ subject to

$$\sum_{j=1}^{n} a_{ij}x_j = b_i, \quad i = 1, 2, \ldots, m$$

where $x_i \geq 0$, $i = 1, 2, \ldots, n$

In matrix form: Minimize $f = \mathbf{c}^T\mathbf{X}$ subject to $\mathbf{A}\mathbf{X} = \mathbf{b}$, where $\mathbf{X} \geq \mathbf{0}$

Corresponding dual problem:

Maximize $v = \sum_{i=1}^{m} y_i b_i$ subject to

$$\sum_{i=1}^{m} y_i a_{ij} \leq c_j, \quad j = 1, 2, \ldots, n$$

where $y_i$ is unrestricted in sign, $i = 1, 2, \ldots, m$

In matrix form: Maximize $v = \mathbf{Y}^T\mathbf{b}$ subject to $\mathbf{A}^T\mathbf{Y} \leq \mathbf{c}$, where $\mathbf{Y}$ is unrestricted in sign


Example 4.2 Write the dual of the following linear programming problem:

Maximize $f = 50x_1 + 100x_2$

subject to

$$\begin{aligned}
2x_1 + x_2 &\leq 1250 \\
2x_1 + 5x_2 &\leq 1000 \\
2x_1 + 3x_2 &\leq 900 \\
x_2 &\leq 150
\end{aligned}$$

where

$$x_1 \geq 0 \quad \text{and} \quad x_2 \geq 0$$

SOLUTION Let $y_1$, $y_2$, $y_3$, and $y_4$ be the dual variables. Then the dual problem can
be stated as

Minimize $v = 1250y_1 + 1000y_2 + 900y_3 + 150y_4$

subject to

$$\begin{aligned}
2y_1 + 2y_2 + 2y_3 &\geq 50 \\
y_1 + 5y_2 + 3y_3 + y_4 &\geq 100
\end{aligned}$$

where $y_1 \geq 0$, $y_2 \geq 0$, $y_3 \geq 0$, and $y_4 \geq 0$.

Notice that the dual problem has fewer constraints than the
primal problem in this case. Since, in general, an additional constraint requires more
computational effort than an additional variable in a linear programming problem, it
is computationally more efficient to solve the dual problem in the
present case. This is one of the advantages of the dual problem.
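The dual of Example 4.2 can be generated mechanically. In the sketch below the helper `dual_of_max_leq` is hypothetical; it encodes the rule for a maximization primal with "less than or equal to" constraints, which follows from the symmetric pair of Section 4.3.1 after multiplying the primal through by $-1$.

```python
def dual_of_max_leq(A, b, c):
    """Primal: max c.x, A x <= b, x >= 0  ->  dual: min b.y, A^T y >= c, y >= 0.

    Returns (constraint matrix, right-hand side, objective coefficients) of the dual.
    """
    A_T = [list(col) for col in zip(*A)]
    return A_T, list(c), list(b)

# Data of Example 4.2.
A = [[2, 1],
     [2, 5],
     [2, 3],
     [0, 1]]
b = [1250, 1000, 900, 150]
c = [50, 100]

A_dual, rhs_dual, obj_dual = dual_of_max_leq(A, b, c)
for row, rhs in zip(A_dual, rhs_dual):
    print(row, ">=", rhs)
print("minimize", obj_dual)
```

The primal has four constraints and two variables, while the dual has two constraints and four variables, matching the observation made in the text.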
4.3.4 Duality Theorems

The following theorems are useful in developing a method for solving LP problems 
using dual relationships. The proofs of these theorems can be found in Ref. [4.10]. 

Theorem 4.1 The dual of the dual is the primal. 

Theorem 4.2 Any feasible solution of the primal gives an / value greater than or at 
least equal to the v value obtained by any feasible solution of the dual. 

Theorem 4.3 If both primal and dual problems have feasible solutions, both have 
optimal solutions and minimum / = maximum v. 

Theorem 4.4 If either the primal or the dual problem has an unbounded solution, the 
other problem is infeasible. 
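Theorems 4.2 and 4.3 can be illustrated numerically with the data of Example 4.2. Since that primal is a maximization problem, the inequality of Theorem 4.2 reads in the reverse direction: every feasible primal value is less than or equal to every feasible dual value. The points used below were worked out by hand for this illustration and are not taken from the text; the function names are hypothetical.

```python
from fractions import Fraction as Fr

A = [[2, 1], [2, 5], [2, 3], [0, 1]]
b = [1250, 1000, 900, 150]
c = [50, 100]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def primal_feasible(x):
    """A x <= b and x >= 0."""
    return all(dot(row, x) <= bi for row, bi in zip(A, b)) and min(x) >= 0

def dual_feasible(y):
    """A^T y >= c and y >= 0."""
    return all(dot(col, y) >= cj for col, cj in zip(zip(*A), c)) and min(y) >= 0

# Weak duality: an arbitrary feasible pair.
x_any, y_any = [0, 150], [0, 25, 0, 0]
weak_ok = (primal_feasible(x_any) and dual_feasible(y_any)
           and dot(c, x_any) <= dot(b, y_any))

# Equal objective values certify optimality of both points.
x_star = [375, 50]
y_star = [0, Fr(25, 2), Fr(25, 2), 0]
f_star, v_star = dot(c, x_star), dot(b, y_star)
print(weak_ok, f_star, v_star)
```

Because a feasible primal point and a feasible dual point give the same objective value, neither value can be improved, so both points are optimal for their respective problems.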


4.3.5 Dual Simplex Method 

There exist a number of situations in which it is required to find the solution of a
linear programming problem for a number of different right-hand-side vectors $\mathbf{b}^{(i)}$.
Similarly, in some cases, we may be interested in adding some more constraints to a
linear programming problem for which the optimal solution is already known. When
the problem has to be solved for different vectors $\mathbf{b}^{(i)}$, one can always find the desired
solution by applying the two phases of the simplex method separately for each vector
$\mathbf{b}^{(i)}$. However, this procedure will be inefficient since the vectors $\mathbf{b}^{(i)}$ often do not
differ greatly from one another. Hence the solution for one vector, say, $\mathbf{b}^{(1)}$, may be
close to the solution for some other vector, say, $\mathbf{b}^{(2)}$. Thus a better strategy is to solve
the linear programming problem for $\mathbf{b}^{(1)}$ and obtain an optimal basis matrix $\mathbf{B}$. If this
basis happens to be feasible for all the right-hand-side vectors, that is, if

$$\mathbf{B}^{-1}\mathbf{b}^{(i)} \geq \mathbf{0} \quad \text{for all } i \tag{4.19}$$

then it will be optimal for all cases. On the other hand, if the basis $\mathbf{B}$ is not feasible
for some of the right-hand-side vectors, that is, if

$$\mathbf{B}^{-1}\mathbf{b}^{(r)} \ngeq \mathbf{0} \quad \text{for some } r \tag{4.20}$$

then the vector of simplex multipliers

$$\boldsymbol{\pi}^T = \mathbf{c}_B^T\mathbf{B}^{-1} \tag{4.21}$$

will form a dual feasible solution since the quantities

$$\bar{c}_j = c_j - \boldsymbol{\pi}^T\mathbf{A}_j \geq 0$$

are independent of the right-hand-side vector $\mathbf{b}^{(r)}$. A similar situation exists when the
problem has to be solved with additional constraints.
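The feasibility test of Eqs. (4.19) and (4.20) amounts to one matrix-vector product per right-hand side. The sketch below reuses the optimal basis inverse of Example 4.1 (basis $x_2$, $x_3$, $x_6$, from Table 4.10); the alternative right-hand-side vectors are made up for illustration and the function names are hypothetical.

```python
from fractions import Fraction as Fr

# Optimal basis inverse of Example 4.1, read from Table 4.10.
B_inv = [[Fr(5, 4), Fr(1, 4), Fr(0)],
         [Fr(1, 4), Fr(1, 4), Fr(0)],
         [Fr(-3, 2), Fr(-1, 2), Fr(1)]]

def basic_values(B_inv, b):
    """X_B = B^{-1} b."""
    return [sum(p * q for p, q in zip(row, b)) for row in B_inv]

def basis_stays_feasible(B_inv, b):
    """Eq. (4.19): the basis remains feasible (hence optimal) if B^{-1} b >= 0."""
    return all(v >= 0 for v in basic_values(B_inv, b))

original = basic_values(B_inv, [2, 6, 6])         # right-hand side of Example 4.1
ok_case = basis_stays_feasible(B_inv, [2, 2, 6])  # hypothetical new right-hand side
bad_case = basis_stays_feasible(B_inv, [2, 6, 3]) # hypothetical new right-hand side
print(original, ok_case, bad_case)
```

In the second hypothetical case the basis is primal infeasible, but the relative costs do not depend on $\mathbf{b}$, so the basis is still dual feasible; this is the situation the dual simplex method is designed for.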

In both the situations discussed above, we have an infeasible basic (primal) solution
whose associated dual solution is feasible. Several methods have been proposed,
as variants of the regular simplex method, to solve a linear programming problem by
starting from an infeasible solution to the primal. All these methods work in an iterative
manner such that they force the solution to become feasible as well as optimal simultaneously
at some stage. Among all the methods, the dual simplex method developed by
Lemke [4.2] and the primal-dual method developed by Dantzig, Ford, and Fulkerson
[4.3] have been most widely used. Both these methods have the following important
characteristics:

1. They do not require the phase I computations of the simplex method. This is a 
desirable feature since the starting point found by phase I may be nowhere near 
optimal, since the objective of phase I ignores the optimality of the problem 
completely. 

2 . Since they work toward feasibility and optimality simultaneously, we can expect 
to obtain the solution in a smaller total number of iterations. 

We shall consider only the dual simplex algorithm in this section. 

Algorithm. As stated earlier, the dual simplex method requires the availability of 
a dual feasible solution that is not primal feasible to start with. It is the same as the 
simplex method applied to the dual problem but is developed such that it can make use 
of the same tableau as the primal method. Computationally, the dual simplex algorithm 
also involves a sequence of pivot operations, but with different rules (compared to the 
regular simplex method) for choosing the pivot element. 

Let the problem to be solved be initially in canonical form with some of the $b_i < 0$, the relative cost coefficients corresponding to the basic variables $c_j = 0$, and all other $c_j \ge 0$. Since some of the $b_i$ are negative, the primal solution will be infeasible, and since all $c_j \ge 0$, the corresponding dual solution will be feasible. Then the dual simplex method works according to the following iterative steps.

1. Select row r as the pivot row such that

$$ b_r = \min_i b_i < 0 \tag{4.22} $$

2. Select column s as the pivot column such that

$$ \frac{c_s}{-a_{rs}} = \min_{a_{rj} < 0} \left( \frac{c_j}{-a_{rj}} \right) \tag{4.23} $$

If all $a_{rj} \ge 0$, the primal will not have any feasible (optimal) solution.

3. Carry out a pivot operation on $a_{rs}$.

4. Test for optimality: If all $b_i \ge 0$, the current solution is optimal and hence stop the iterative procedure. Otherwise, go to step 1.

Remarks:

1. Since we are applying the simplex method to the dual, the dual solution will always be maintained feasible, and hence all the relative cost factors of the primal ($c_j$) will be nonnegative. Thus the optimality test in step 4 is valid because it guarantees that all $b_i$ are also nonnegative, thereby ensuring a feasible solution to the primal.

2. We can see that the primal will not have a feasible solution when all $a_{rj}$ are nonnegative from the following reasoning. Let $(x_1, x_2, \ldots, x_m)$ be the set of basic variables. Then the rth basic variable, $x_r$, can be expressed as

$$ x_r = b_r - \sum_{j=m+1}^{n} a_{rj} x_j $$

It can be seen that if $b_r < 0$ and $a_{rj} \ge 0$ for all j, $x_r$ cannot be made nonnegative for any nonnegative value of $x_j$. Thus the primal problem contains an equation (the rth one) that cannot be satisfied by any set of nonnegative variables and hence will not have any feasible solution.
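
The four steps above can be sketched in code. This is only a minimal illustration, not the book's implementation: the function name `dual_simplex`, the tableau layout (objective row last, right-hand side in the last column, the $-f$ column omitted), and the use of exact fractions are assumptions of this sketch. The data are those of the example that follows.

```python
from fractions import Fraction

def dual_simplex(tableau, basis):
    """Dual simplex on a canonical tableau (hypothetical helper).

    tableau: rows of coefficients; last row is the objective row,
             last column holds b_i (and -f in the objective row).
    basis:   column index of the basic variable of each constraint row.
    """
    T = [[Fraction(v) for v in row] for row in tableau]
    basis = list(basis)
    m, n = len(T) - 1, len(T[0]) - 1
    while True:
        # Step 1: pivot row r has the most negative right-hand side.
        r = min(range(m), key=lambda i: T[i][-1])
        if T[r][-1] >= 0:
            return T, basis          # Step 4: all b_i >= 0, optimal
        # Step 2: ratio test over the negative entries of row r.
        cols = [j for j in range(n) if T[r][j] < 0]
        if not cols:
            raise ValueError("primal problem has no feasible solution")
        s = min(cols, key=lambda j: T[-1][j] / -T[r][j])
        # Step 3: pivot operation on T[r][s].
        piv = T[r][s]
        T[r] = [v / piv for v in T[r]]
        for i in range(m + 1):
            if i != r and T[i][s] != 0:
                factor = T[i][s]
                T[i] = [a - factor * b for a, b in zip(T[i], T[r])]
        basis[r] = s

# Data of Example 4.3 (columns x1..x6, then the right-hand side).
rows = [
    [-1,  0, 1, 0, 0, 0, Fraction(-5, 2)],
    [ 0, -1, 0, 1, 0, 0, -6],
    [-2, -1, 0, 0, 1, 0, -17],
    [-1, -1, 0, 0, 0, 1, -12],
    [20, 16, 0, 0, 0, 0, 0],         # objective row; last entry is -f
]
T, basis = dual_simplex(rows, [2, 3, 4, 5])   # x3..x6 basic initially
x = [Fraction(0)] * 6
for i, j in enumerate(basis):
    x[j] = T[i][-1]
f_min = -T[-1][-1]
print(x, f_min)
```

Exact fractions are used so that the ratio test and the sign checks are not affected by rounding; a floating-point version would need tolerances in both places.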

The following example is considered to illustrate the dual simplex method. 


Example 4.3 


Minimize $f = 20x_1 + 16x_2$

subject to

$$
\begin{aligned}
x_1 &\ge 2.5 \\
x_2 &\ge 6 \\
2x_1 + x_2 &\ge 17 \\
x_1 + x_2 &\ge 12 \\
x_1 \ge 0, \quad x_2 &\ge 0
\end{aligned}
$$

SOLUTION By introducing the surplus variables $x_3$, $x_4$, $x_5$, and $x_6$, the problem can be stated in canonical form as

Minimize f

with

$$
\begin{aligned}
-x_1 + x_3 &= -2.5 \\
-x_2 + x_4 &= -6 \\
-2x_1 - x_2 + x_5 &= -17 \\
-x_1 - x_2 + x_6 &= -12 \\
20x_1 + 16x_2 - f &= 0
\end{aligned}
\tag{E_1}
$$

$$ x_i \ge 0, \quad i = 1 \text{ to } 6 $$

The basic solution corresponding to (E1) is infeasible since $x_3 = -2.5$, $x_4 = -6$, $x_5 = -17$, and $x_6 = -12$. However, the objective equation shows optimality since the cost coefficients corresponding to the nonbasic variables are nonnegative ($c_1 = 20$, $c_2 = 16$). This shows that the solution is infeasible to the primal but feasible to the dual. Hence the dual simplex method can be applied to solve this problem as follows.

Step 1 Write the system of equations (E1) in tableau form (the pivot element is shown in brackets):

Basic                          Variables
variables     x1     x2     x3     x4     x5     x6     -f      b_i
x3            -1      0      1      0      0      0      0     -2.5
x4             0     -1      0      1      0      0      0     -6
x5           [-2]    -1      0      0      1      0      0    -17     Minimum, pivot row
x6            -1     -1      0      0      0      1      0    -12
-f            20     16      0      0      0      0      1      0

Select the pivotal row r such that

$$ b_r = \min(b_i < 0) = b_3 = -17 $$

in this case. Hence r = 3.

Step 2 Select the pivotal column s as

$$ \frac{c_s}{-a_{rs}} = \min_{a_{rj} < 0} \left( \frac{c_j}{-a_{rj}} \right) $$

Since

$$ \frac{c_1}{-a_{31}} = \frac{20}{2} = 10, \qquad \frac{c_2}{-a_{32}} = \frac{16}{1} = 16 $$

we have s = 1.

Step 3 The pivot operation is carried out on $a_{31}$ in the preceding table, and the result is as follows:

Basic                          Variables
variables     x1     x2     x3     x4     x5     x6     -f      b_i
x3             0     1/2     1      0    -1/2     0      0       6
x4             0    [-1]     0      1      0      0      0      -6     Minimum, pivot row
x1             1     1/2     0      0    -1/2     0      0     17/2
x6             0    -1/2     0      0    -1/2     1      0     -7/2
-f             0      6      0      0     10      0      1    -170

Step 4 Since some of the $b_i$ are $< 0$, the present solution is not optimum. Hence we proceed to the next iteration.

Step 1 The pivot row corresponding to minimum ($b_i < 0$) can be seen to be 2 in the preceding table.

Step 2 Since $a_{22}$ is the only negative coefficient, it is taken as the pivot element.

Step 3 The result of pivot operation on $a_{22}$ in the preceding table is as follows:

Basic                          Variables
variables     x1     x2     x3     x4     x5     x6     -f      b_i
x3             0      0      1     1/2   -1/2     0      0       3
x2             0      1      0     -1      0      0      0       6
x1             1      0      0     1/2   -1/2     0      0     11/2
x6             0      0      0   [-1/2]  -1/2     1      0     -1/2    Minimum, pivot row
-f             0      0      0      6     10      0      1    -206

Step 4 Since not all $b_i$ are $\ge 0$, the present solution is not optimum. Hence we go to the next iteration.

Step 1 The pivot row (corresponding to minimum $b_i < 0$) can be seen to be the fourth row.

Step 2 Since

$$ \frac{c_4}{-a_{44}} = 12 \quad \text{and} \quad \frac{c_5}{-a_{45}} = 20 $$

the pivot column is selected as s = 4.

Step 3 The pivot operation is carried out on $a_{44}$ in the preceding table and the result is as follows:

Basic                          Variables
variables     x1     x2     x3     x4     x5     x6     -f      b_i
x3             0      0      1      0     -1      1      0      5/2
x2             0      1      0      0      1     -2      0       7
x1             1      0      0      0     -1      1      0       5
x4             0      0      0      1      1     -2      0       1
-f             0      0      0      0      4     12      1    -212

Step 4 Since all $b_i$ are $\ge 0$, the present solution is dual optimal and primal feasible. The solution is

$x_1 = 5$, $x_2 = 7$, $x_3 = 5/2$, $x_4 = 1$ (dual basic variables)
$x_5 = x_6 = 0$ (dual nonbasic variables)

$f_{\min} = 212$
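
The result of Example 4.3 can be cross-checked through duality. The sketch below assumes that the dual values are read from the relative cost coefficients of the surplus columns $x_3$ to $x_6$ in the final tableau, which gives $y = (0, 0, 4, 12)$; the variable names are illustrative only. It verifies primal feasibility, dual feasibility, and equality of the two objective values.

```python
from fractions import Fraction

# Primal of Example 4.3: minimize c.x subject to A x >= b, x >= 0.
c = [20, 16]
A = [[1, 0], [0, 1], [2, 1], [1, 1]]
b = [Fraction(5, 2), 6, 17, 12]

x = [5, 7]            # primal optimum from the final tableau
y = [0, 0, 4, 12]     # assumed dual values (relative costs of x3..x6)

# Primal feasibility: every constraint A x >= b holds.
primal_feasible = all(
    sum(a * xj for a, xj in zip(row, x)) >= bi for row, bi in zip(A, b)
)
# Dual feasibility: A^T y <= c with y >= 0.
dual_feasible = all(yi >= 0 for yi in y) and all(
    sum(A[i][j] * y[i] for i in range(len(A))) <= c[j] for j in range(len(c))
)
primal_obj = sum(cj * xj for cj, xj in zip(c, x))
dual_obj = sum(bi * yi for bi, yi in zip(b, y))
print(primal_feasible, dual_feasible, primal_obj, dual_obj)
```

Equal objective values for a feasible primal-dual pair confirm that both solutions are optimal.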
4.4 DECOMPOSITION PRINCIPLE

Some of the linear programming problems encountered in practice may be very large in terms of the number of variables and/or constraints. If the problem has some special structure, it is possible to obtain the solution by applying the decomposition principle developed by Dantzig and Wolfe [4.4]. In the decomposition method, the original problem is decomposed into small subproblems and then these subproblems are solved almost independently. The procedure, when applicable, has the advantage of making it possible to solve large-scale problems that may otherwise be computationally very difficult or infeasible. As an example of a problem for which the decomposition principle can be applied, consider a company having two factories, producing three and two products, respectively. Each factory has its own internal resources for production, namely, workers and machines. The two factories are coupled by the fact that there is a shared resource that both use, for example, a raw material whose availability is limited. Let $b_2$ and $b_3$ be the maximum available internal resources for factory 1, and let $b_4$ and $b_5$ be the similar availabilities for factory 2. If the limitation on the common resource is $b_1$, the problem can be stated as follows:

Minimize $f(x_1, x_2, x_3, y_1, y_2) = c_1 x_1 + c_2 x_2 + c_3 x_3 + c_4 y_1 + c_5 y_2$

subject to

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}y_1 + a_{15}y_2 &\le b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &\le b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &\le b_3 \\
a_{41}y_1 + a_{42}y_2 &\le b_4 \\
a_{51}y_1 + a_{52}y_2 &\le b_5 \\
x_i \ge 0 \ (i = 1, 2, 3), \quad y_j \ge 0 \ (j &= 1, 2)
\end{aligned}
\tag{4.24}
$$

where $x_i$ and $y_j$ are the quantities of the various products produced by the two factories (design variables) and the $a_{ij}$ are the quantities of resource i required to produce 1 unit of product j.


An important characteristic of the problem stated in Eqs. (4.24) is that its constraints 
consist of two independent sets of inequalities. The first set consists of a coupling 
constraint involving all the design variables, and the second set consists of two groups 
of constraints, each group containing the design variables of that group only. This 
problem can be generalized as follows: 

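Minimize $f(X) = c_1^T X_1 + c_2^T X_2 + \cdots + c_p^T X_p$    (4.25a)

subject to

$$ A_1 X_1 + A_2 X_2 + \cdots + A_p X_p = b_0 \tag{4.25b} $$

$$
\begin{aligned}
B_1 X_1 &= b_1 \\
B_2 X_2 &= b_2 \\
&\ \ \vdots \\
B_p X_p &= b_p
\end{aligned}
\tag{4.25c}
$$

$$ X_1 \ge 0, \quad X_2 \ge 0, \quad \ldots, \quad X_p \ge 0 $$

where

$$
X_1 = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{m_1} \end{Bmatrix}, \quad
X_2 = \begin{Bmatrix} x_{m_1+1} \\ x_{m_1+2} \\ \vdots \\ x_{m_1+m_2} \end{Bmatrix}, \quad \ldots, \quad
X_p = \begin{Bmatrix} x_{m_1+m_2+\cdots+m_{p-1}+1} \\ x_{m_1+m_2+\cdots+m_{p-1}+2} \\ \vdots \\ x_{m_1+m_2+\cdots+m_{p-1}+m_p} \end{Bmatrix}, \quad
X = \begin{Bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{Bmatrix}
$$

It can be noted that if the size of the matrix $A_k$ is $(r_0 \times m_k)$ and that of $B_k$ is $(r_k \times m_k)$, the problem has $\sum_{k=0}^{p} r_k$ constraints and $\sum_{k=1}^{p} m_k$ variables.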

Since there are a large number of constraints in the problem stated in Eqs. (4.25), 
it may not be computationally efficient to solve it by using the regular simplex 
method. However, the decomposition principle can be used to solve it in an efficient 
manner. The basic solution procedure using the decomposition principle is given by 
the following steps. 


1. Define p subsidiary constraint sets using Eqs. (4.25) as

$$
\begin{aligned}
B_1 X_1 &= b_1 \\
B_2 X_2 &= b_2 \\
&\ \ \vdots \\
B_k X_k &= b_k \\
&\ \ \vdots \\
B_p X_p &= b_p
\end{aligned}
\tag{4.26}
$$

The subsidiary constraint set

$$ B_k X_k = b_k, \quad k = 1, 2, \ldots, p \tag{4.27} $$

represents $r_k$ equality constraints. These constraints along with the requirement $X_k \ge 0$ define the set of feasible solutions of Eqs. (4.27). Assuming that this set of feasible solutions is a bounded convex set, let $s_k$ be the number of vertices of this set. By using the definition of convex combination of a set of points,† any point $X_k$ satisfying Eqs. (4.27) can be represented as

$$ X_k = \mu_{k,1} X_1^{(k)} + \mu_{k,2} X_2^{(k)} + \cdots + \mu_{k,s_k} X_{s_k}^{(k)} \tag{4.28} $$

$$ \mu_{k,1} + \mu_{k,2} + \cdots + \mu_{k,s_k} = 1 \tag{4.29} $$

$$ 0 \le \mu_{k,i} \le 1, \quad i = 1, 2, \ldots, s_k, \quad k = 1, 2, \ldots, p \tag{4.30} $$

where $X_1^{(k)}, X_2^{(k)}, \ldots, X_{s_k}^{(k)}$ are the extreme points of the feasible set defined by Eqs. (4.27). These extreme points $X_1^{(k)}, X_2^{(k)}, \ldots, X_{s_k}^{(k)}$; $k = 1, 2, \ldots, p$, can be found by solving the Eqs. (4.27).

2. These new Eqs. (4.28) imply the complete solution space enclosed by the constraints

$$ B_k X_k = b_k, \quad X_k \ge 0, \quad k = 1, 2, \ldots, p \tag{4.31} $$

By substituting Eqs. (4.28) into Eqs. (4.25), it is possible to eliminate the subsidiary constraint sets from the original problem and obtain the following equivalent form:

$$ \text{Minimize } f(X) = c_1^T \left( \sum_{i=1}^{s_1} \mu_{1,i} X_i^{(1)} \right) + c_2^T \left( \sum_{i=1}^{s_2} \mu_{2,i} X_i^{(2)} \right) + \cdots + c_p^T \left( \sum_{i=1}^{s_p} \mu_{p,i} X_i^{(p)} \right) $$

subject to

$$
\begin{aligned}
A_1 \left( \sum_{i=1}^{s_1} \mu_{1,i} X_i^{(1)} \right) + A_2 \left( \sum_{i=1}^{s_2} \mu_{2,i} X_i^{(2)} \right) + \cdots + A_p \left( \sum_{i=1}^{s_p} \mu_{p,i} X_i^{(p)} \right) &= b_0 \\
\sum_{i=1}^{s_1} \mu_{1,i} &= 1 \\
\sum_{i=1}^{s_2} \mu_{2,i} &= 1 \\
&\ \ \vdots \\
\sum_{i=1}^{s_p} \mu_{p,i} &= 1 \\
\mu_{j,i} \ge 0, \quad i = 1, 2, \ldots, s_j, \quad j = 1, 2, &\ldots, p
\end{aligned}
\tag{4.32}
$$

Since the extreme points $X_1^{(k)}, X_2^{(k)}, \ldots, X_{s_k}^{(k)}$ are known from the solution of the set $B_k X_k = b_k$, $X_k \ge 0$, $k = 1, 2, \ldots, p$, and since $c_k$ and $A_k$, $k = 1, 2, \ldots, p$, are known as problem data, the unknowns in Eqs. (4.32) are $\mu_{j,i}$, $i = 1, 2, \ldots, s_j$; $j = 1, 2, \ldots, p$. Hence $\mu_{j,i}$ will be the new decision variables of the modified problem stated in Eqs. (4.32).

† If $X^{(1)}$ and $X^{(2)}$ are any two points in an n-dimensional space, any point lying on the line segment joining $X^{(1)}$ and $X^{(2)}$ is given by a convex combination of $X^{(1)}$ and $X^{(2)}$ as

$$ X(\mu) = \mu X^{(1)} + (1 - \mu) X^{(2)}, \quad 0 \le \mu \le 1 $$

This idea can be generalized to define the convex combination of r points $X^{(1)}, X^{(2)}, \ldots, X^{(r)}$ as

$$ X(\mu_1, \mu_2, \ldots, \mu_r) = \mu_1 X^{(1)} + \mu_2 X^{(2)} + \cdots + \mu_r X^{(r)} $$

where $\mu_1 + \mu_2 + \cdots + \mu_r = 1$ and $0 \le \mu_i \le 1$, $i = 1, 2, \ldots, r$.

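3. Solve the linear programming problem stated in Eqs. (4.32) by any of the known techniques and find the optimal values of $\mu_{j,i}$. Once the optimal values $\mu_{j,i}^*$ are determined, the optimal solution of the original problem can be obtained as

$$ X^* = \begin{Bmatrix} X_1^* \\ X_2^* \\ \vdots \\ X_p^* \end{Bmatrix} $$

where

$$ X_k^* = \sum_{i=1}^{s_k} \mu_{k,i}^* X_i^{(k)}, \quad k = 1, 2, \ldots, p $$

Remarks: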


1. It is to be noted that the new problem in Eqs. (4.32) has $(r_0 + p)$ equality constraints only as against $r_0 + \sum_{k=1}^{p} r_k$ in the original problem of Eq. (4.25). Thus there is a substantial reduction in the number of constraints due to the application of the decomposition principle. At the same time, the number of variables might increase from $\sum_{k=1}^{p} m_k$ to $\sum_{k=1}^{p} s_k$, depending on the number of extreme points of the different subsidiary problems defined by Eqs. (4.31). The modified problem, however, is computationally more attractive since the computational effort required for solving any linear programming problem depends primarily on the number of constraints rather than on the number of variables.

2. The procedure outlined above requires the determination of all the extreme points of every subsidiary constraint set defined by Eqs. (4.31) before the optimal values $\mu_{j,i}^*$ are found. However, this is not necessary when the revised simplex method is used to implement the decomposition algorithm [4.5].

3 . If the size of the problem is small, it will be convenient to enumerate all the 
extreme points of the subproblems and use the simplex method to solve the 
problem. This procedure is illustrated in the following example. 


Example 4.4 A fertilizer mixing plant produces two fertilizers, A and B, by mixing two chemicals, $C_1$ and $C_2$, in different proportions. The contents and costs of the chemicals $C_1$ and $C_2$ are as follows:

                     Contents
Chemical      Ammonia     Phosphates     Cost ($/lb)
C1             0.70          0.30             5
C2             0.40          0.60             4

Fertilizer A should not contain more than 60% of ammonia and B should contain at least 50% of ammonia. On the average, the plant can sell up to 1000 lb/hr and due to limitations on the production facilities, not more than 600 lb of fertilizer A can be produced per hour. The availability of chemical $C_1$ is restricted to 500 lb/hr. Assuming that the production costs are the same for both A and B, determine the quantities of A and B to be produced per hour for maximum return if the plant sells A and B at the rates of $6 and $7 per pound, respectively.


SOLUTION Let $x_1$ and $x_2$ indicate the amounts of chemicals $C_1$ and $C_2$ used in fertilizer A, and $y_1$ and $y_2$ in fertilizer B per hour. Thus the total amounts of A and B produced per hour are given by $x_1 + x_2$ and $y_1 + y_2$, respectively. The objective function to be maximized is given by

f = selling price - cost of chemicals $C_1$ and $C_2$

$$ f = 6(x_1 + x_2) + 7(y_1 + y_2) - 5(x_1 + y_1) - 4(x_2 + y_2) $$

The constraints are given by

$(x_1 + x_2) + (y_1 + y_2) \le 1000$    (amount that can be sold)
$x_1 + y_1 \le 500$    (availability of $C_1$)
$x_1 + x_2 \le 600$    (production limitations on A)
$\frac{7}{10} x_1 + \frac{4}{10} x_2 \le \frac{6}{10}(x_1 + x_2)$    (A should not contain more than 60% of ammonia)
$\frac{7}{10} y_1 + \frac{4}{10} y_2 \ge \frac{5}{10}(y_1 + y_2)$    (B should contain at least 50% of ammonia)

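Thus the problem can be restated as

Maximize $f = x_1 + 2x_2 + 2y_1 + 3y_2$    (E1)

subject to

$$
\begin{aligned}
x_1 + x_2 + y_1 + y_2 &\le 1000 \\
x_1 + y_1 &\le 500
\end{aligned}
\tag{E_2}
$$

$$
\begin{aligned}
x_1 + x_2 &\le 600 \\
x_1 - 2x_2 &\le 0
\end{aligned}
\tag{E_3}
$$

$$ -2y_1 + y_2 \le 0 \tag{E_4} $$

$$ x_i \ge 0, \quad y_i \ge 0, \quad i = 1, 2 $$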

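This problem can also be stated in matrix notation as follows:

Maximize $f(X) = c_1^T X_1 + c_2^T X_2$

subject to

$$
\begin{aligned}
A_1 X_1 + A_2 X_2 &\le b_0 \\
B_1 X_1 &\le b_1 \\
B_2 X_2 &\le b_2 \\
X_1 \ge 0, \quad X_2 &\ge 0
\end{aligned}
\tag{E_5}
$$

where

$$
X_1 = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}, \quad
X_2 = \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix}, \quad
c_1 = \begin{Bmatrix} 1 \\ 2 \end{Bmatrix}, \quad
c_2 = \begin{Bmatrix} 2 \\ 3 \end{Bmatrix}, \quad
X = \begin{Bmatrix} X_1 \\ X_2 \end{Bmatrix}
$$

$$
[A_1] = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad
[A_2] = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad
b_0 = \begin{Bmatrix} 1000 \\ 500 \end{Bmatrix}
$$

$$
B_1 = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}, \quad
b_1 = \begin{Bmatrix} 600 \\ 0 \end{Bmatrix}, \quad
B_2 = \{-2 \ \ 1\}, \quad
b_2 = \{0\}
$$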


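Step 1 We first consider the subsidiary constraint sets

$$ B_1 X_1 \le b_1, \quad X_1 \ge 0 \tag{E_6} $$

$$ B_2 X_2 \le b_2, \quad X_2 \ge 0 \tag{E_7} $$

The convex feasible regions represented by (E6) and (E7) are shown in Fig. 4.1a and b, respectively. The vertices of the two feasible regions are given by

$$
X_1^{(1)} = \text{point } P = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}, \quad
X_2^{(1)} = \text{point } Q = \begin{Bmatrix} 0 \\ 600 \end{Bmatrix}, \quad
X_3^{(1)} = \text{point } R = \begin{Bmatrix} 400 \\ 200 \end{Bmatrix}
$$

$$
X_1^{(2)} = \text{point } S = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}, \quad
X_2^{(2)} = \text{point } T = \begin{Bmatrix} 1000 \\ 2000 \end{Bmatrix}, \quad
X_3^{(2)} = \text{point } U = \begin{Bmatrix} 1000 \\ 0 \end{Bmatrix}
$$

Figure 4.1 Vertices of feasible regions. To make the feasible region bounded, the constraint $y_1 \le 1000$ is added in view of Eq. (E2).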
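Thus any point in the convex feasible sets defined by Eqs. (E6) and (E7) can be represented, respectively, as

$$
X_1 = \mu_{11} \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + \mu_{12} \begin{Bmatrix} 0 \\ 600 \end{Bmatrix} + \mu_{13} \begin{Bmatrix} 400 \\ 200 \end{Bmatrix}
= \begin{Bmatrix} 400\mu_{13} \\ 600\mu_{12} + 200\mu_{13} \end{Bmatrix}
$$

with

$$ \mu_{11} + \mu_{12} + \mu_{13} = 1, \quad 0 \le \mu_{1i} \le 1, \quad i = 1, 2, 3 \tag{E_8} $$

and

$$
X_2 = \mu_{21} \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + \mu_{22} \begin{Bmatrix} 1000 \\ 2000 \end{Bmatrix} + \mu_{23} \begin{Bmatrix} 1000 \\ 0 \end{Bmatrix}
= \begin{Bmatrix} 1000\mu_{22} + 1000\mu_{23} \\ 2000\mu_{22} \end{Bmatrix}
$$

with

$$ \mu_{21} + \mu_{22} + \mu_{23} = 1, \quad 0 \le \mu_{2i} \le 1, \quad i = 1, 2, 3 \tag{E_9} $$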

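Step 2 By substituting the relations of (E8) and (E9), the problem stated in Eqs. (E5) can be rewritten as

$$
\begin{aligned}
\text{Maximize } f(\mu_{11}, \mu_{12}, \ldots, \mu_{23})
&= (1 \ \ 2) \begin{Bmatrix} 400\mu_{13} \\ 600\mu_{12} + 200\mu_{13} \end{Bmatrix}
+ (2 \ \ 3) \begin{Bmatrix} 1000\mu_{22} + 1000\mu_{23} \\ 2000\mu_{22} \end{Bmatrix} \\
&= 800\mu_{13} + 1200\mu_{12} + 8000\mu_{22} + 2000\mu_{23}
\end{aligned}
$$

subject to

$$
\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{Bmatrix} 400\mu_{13} \\ 600\mu_{12} + 200\mu_{13} \end{Bmatrix}
+ \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{Bmatrix} 1000\mu_{22} + 1000\mu_{23} \\ 2000\mu_{22} \end{Bmatrix}
\le \begin{Bmatrix} 1000 \\ 500 \end{Bmatrix}
$$

that is,

$$
\begin{aligned}
600\mu_{12} + 600\mu_{13} + 3000\mu_{22} + 1000\mu_{23} &\le 1000 \\
400\mu_{13} + 1000\mu_{22} + 1000\mu_{23} &\le 500 \\
\mu_{11} + \mu_{12} + \mu_{13} &= 1 \\
\mu_{21} + \mu_{22} + \mu_{23} &= 1
\end{aligned}
$$

with

$$ \mu_{11} \ge 0, \ \mu_{12} \ge 0, \ \mu_{13} \ge 0, \ \mu_{21} \ge 0, \ \mu_{22} \ge 0, \ \mu_{23} \ge 0 $$

The optimization problem can be stated in standard form (after adding the slack variables $\alpha$ and $\beta$) as

Minimize $f = -1200\mu_{12} - 800\mu_{13} - 8000\mu_{22} - 2000\mu_{23}$

subject to

$$
\begin{aligned}
600\mu_{12} + 600\mu_{13} + 3000\mu_{22} + 1000\mu_{23} + \alpha &= 1000 \\
400\mu_{13} + 1000\mu_{22} + 1000\mu_{23} + \beta &= 500 \\
\mu_{11} + \mu_{12} + \mu_{13} &= 1 \\
\mu_{21} + \mu_{22} + \mu_{23} &= 1 \\
\mu_{ij} \ge 0 \ (i = 1, 2; \ j = 1, 2, 3), \quad \alpha \ge 0, \quad \beta &\ge 0
\end{aligned}
\tag{E_{10}}
$$

Step 3 The problem (E10) can now be solved by using the simplex method.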
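
The coefficients of the master problem (E10) can be checked by pricing each vertex: the objective coefficient of $\mu_{k,i}$ is $c_k^T X_i^{(k)}$ and its column in the coupling constraints is $A_k X_i^{(k)}$. The short sketch below does this for Example 4.4; the helper names `dot` and `master_columns` are illustrative assumptions, not notation from the text.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def master_columns(c, A, vertices):
    """For each vertex return (objective coefficient, coupling-constraint column)."""
    return [(dot(c, v), tuple(dot(row, v) for row in A)) for v in vertices]

c1, c2 = (1, 2), (2, 3)
A1 = A2 = ((1, 1), (1, 0))
V1 = [(0, 0), (0, 600), (400, 200)]        # points P, Q, R
V2 = [(0, 0), (1000, 2000), (1000, 0)]     # points S, T, U

cols1 = master_columns(c1, A1, V1)   # columns for mu_11, mu_12, mu_13
cols2 = master_columns(c2, A2, V2)   # columns for mu_21, mu_22, mu_23
print(cols1)
print(cols2)
```

Each pair lists the objective coefficient followed by the entries of that variable in the two coupling rows, which can be compared directly with the maximization form of the master problem.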


4.5 SENSITIVITY OR POSTOPTIMALITY ANALYSIS 

In most practical problems, we are interested not only in the optimal solution of the LP
problem, but also in how the solution changes when the parameters of the problem
change. The change in the parameters may be discrete or continuous. The study of 
the effect of discrete parameter changes on the optimal solution is called sensitivity 
analysis and that of the continuous changes is termed parametric programming . One 
way to determine the effects of changes in the parameters is to solve a series of new 
problems once for each of the changes made. This is, however, very inefficient from a 
computational point of view. Some techniques that take advantage of the properties of 
the simplex solution are developed to make a sensitivity analysis. We study some of 
these techniques in this section. There are five basic types of parameter changes that 
affect the optimal solution: 

1. Changes in the right-hand-side constants $b_i$

2. Changes in the cost coefficients $c_j$

3. Changes in the coefficients of the constraints $a_{ij}$

4. Addition of new variables 

5. Addition of new constraints 

In general, when a parameter is changed, it results in one of three cases: 

1. The optimal solution remains unchanged; that is, the basic variables and their 
values remain unchanged. 

2. The basic variables remain the same but their values are changed. 

3. The basic variables as well as their values are changed.

4.5.1 Changes in the Right-Hand-Side Constants $b_i$

Suppose that we have found the optimal solution to a LP problem. Let us now change the $b_i$ to $b_i + \Delta b_i$ so that the new problem differs from the original only on the right-hand side. Our interest is to investigate the effect of changing $b_i$ to $b_i + \Delta b_i$ on the original optimum. We know that a basis is optimal if the relative cost coefficients corresponding to the nonbasic variables $\bar{c}_j$ are nonnegative. By considering the procedure according to which $\bar{c}_j$ are obtained, we can see that the values of $\bar{c}_j$ are not related to the $b_i$. The values of $\bar{c}_j$ depend only on the basis, on the coefficients of the constraint matrix, and the original coefficients of the objective function. The relation is given in Eq. (4.10):

$$ \bar{c}_j = c_j - \pi^T A_j = c_j - c_B^T B^{-1} A_j \tag{4.33} $$

Thus changes in $b_i$ will affect the values of basic variables in the optimal solution and the optimality of the basis will not be affected provided that the changes made in $b_i$ do not make the basic solution infeasible. Thus if the new basic solution remains feasible for the new right-hand side, that is, if

$$ X_B' = B^{-1}(b + \Delta b) \ge 0 \tag{4.34} $$

then the original optimal basis, B, also remains optimal for the new problem. Since the original solution, say†

$$ X_B = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{Bmatrix} $$

is given by

$$ X_B = B^{-1} b \tag{4.35} $$

Equation (4.34) can also be expressed as

$$ x_i' = x_i + \sum_{j=1}^{m} \beta_{ij} \Delta b_j \ge 0, \quad i = 1, 2, \ldots, m \tag{4.36} $$

where

$$ B^{-1} = [\beta_{ij}] \tag{4.37} $$

Hence the original optimal basis B remains optimal provided that the changes made in $b_i$, $\Delta b_i$, satisfy the inequalities (4.36). The change in the value of the ith optimal basic variable, $\Delta x_i$, due to the change in $b_i$ is given by

$$ X_B' - X_B = \Delta X_B = B^{-1} \Delta b $$

that is,

$$ \Delta x_i = \sum_{j=1}^{m} \beta_{ij} \Delta b_j, \quad i = 1, 2, \ldots, m \tag{4.38} $$

Finally, the change in the optimal value of the objective function ($\Delta f$) due to the change $\Delta b_i$ can be obtained as

$$ \Delta f = c_B^T \Delta X_B = c_B^T B^{-1} \Delta b = \pi^T \Delta b = \sum_{j=1}^{m} \pi_j \Delta b_j \tag{4.39} $$

† It is assumed that the variables are renumbered such that the first m variables represent the basic variables and the remaining n - m the nonbasic variables.
1=1 

Suppose that the changes made in $b_i$ ($\Delta b_i$) are such that the inequality (4.34) is violated for some variables so that these variables become infeasible for the new right-hand-side vector. Our interest in this case will be to determine the new optimal solution. This can be done without reworking the problem from the beginning by proceeding according to the following steps:

1. Replace the $\bar{b}_i$ of the original optimal tableau by the new values, $\bar{b}' = B^{-1}(b + \Delta b)$, and change the signs of all the numbers that are lying in the rows in which the infeasible variables appear, that is, in rows for which $\bar{b}_i' < 0$.

2 . Add artificial variables to these rows, thereby replacing the infeasible variables 
in the basis by the artificial variables. 

3 . Go through the phase I calculations to find a basic feasible solution for the 
problem with the new right-hand side. 

4 . If the solution found at the end of phase I is not optimal, we go through the 
phase II calculations to find the new optimal solution. 

The procedure outlined above saves considerable time and effort compared to reworking the problem from the beginning if only a few variables become infeasible with the new right-hand side. However, if many variables become infeasible, the procedure above might require as much effort as reworking the problem from the beginning.

Example 4.5 A manufacturer produces four products, A, B, C, and D, by using two types of machines (lathes and milling machines). The times required on the two machines to manufacture 1 unit of each of the four products, the profit per unit of the product, and the total time available on the two types of machines per day are given below:

                       Time required per unit (min) for product:     Total time available
Machine                   A        B        C        D               per day (min)
Lathe machine             7       10        4        9                   1200
Milling machine           3       40        1        1                    800
Profit per unit ($)      45      100       30       50

Find the number of units to be manufactured of each product per day for maximizing 
the profit. 

Note: This is an ordinary LP problem and is given to serve as a reference problem for illustrating the sensitivity analysis.

SOLUTION Let $x_1$, $x_2$, $x_3$, and $x_4$ denote the number of units of products A, B, C, and D produced per day. Then the problem can be stated in standard form as follows:

Minimize $f = -45x_1 - 100x_2 - 30x_3 - 50x_4$

subject to

$$
\begin{aligned}
7x_1 + 10x_2 + 4x_3 + 9x_4 &\le 1200 \\
3x_1 + 40x_2 + x_3 + x_4 &\le 800 \\
x_i \ge 0, \quad i &= 1 \text{ to } 4
\end{aligned}
$$

By introducing the slack variables $x_5 \ge 0$ and $x_6 \ge 0$, the problem can be stated in canonical form and the simplex method can be applied. The computations are shown in tableau form below (the pivot element is shown in brackets):

Basic                           Variables                                           Ratio b_i/a_is
variables     x1      x2      x3      x4      x5      x6     -f       b_i           for a_is > 0
x5             7      10       4       9       1       0      0      1200           120
x6             3     [40]      1       1       0       1      0       800            20   Smaller one, x6 leaves the basis
-f           -45    -100     -30     -50       0       0      1         0

Minimum $\bar{c}_j < 0$ occurs in the $x_2$ column; $x_2$ enters the next basis.

Result of pivot operation:

x5           25/4      0     15/4   [35/4]     1     -1/4     0      1000           800/7   Smaller one, x5 leaves the basis
x2           3/40      1     1/40    1/40      0     1/40     0        20           800
-f          -75/2      0    -55/2   -95/2      0      5/2     1      2000

Minimum $\bar{c}_j < 0$ occurs in the $x_4$ column; $x_4$ enters the basis.

Result of pivot operation:

x4            5/7      0     [3/7]     1     4/35   -1/35     0    4000/35          800/3   Smaller one, x4 leaves the basis
x2           2/35      1     1/70      0    -1/350   9/350    0     120/7           1200
-f          -25/7      0    -50/7      0     38/7     8/7     1   52,000/7

Minimum $\bar{c}_j < 0$ occurs in the $x_3$ column; $x_3$ enters the basis.

Result of pivot operation:

x3            5/3      0       1      7/3    4/15   -1/15     0     800/3
x2           1/30      1       0    -1/30   -1/150   2/75     0      40/3
-f           25/3      0       0     50/3    22/3     2/3     1   28,000/3
The optimum solution is given by

$x_2 = 40/3$, $x_3 = 800/3$ (basic variables)
$x_1 = x_4 = x_5 = x_6 = 0$ (nonbasic variables)
$f_{\min} = -28{,}000/3$ or maximum profit = $28,000/3


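From the final tableau, one can find that

$$ X_B = \begin{Bmatrix} x_3 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 800/3 \\ 40/3 \end{Bmatrix} = \text{vector of basic variables in the optimum solution} \tag{E_1} $$

$$ c_B = \begin{Bmatrix} c_3 \\ c_2 \end{Bmatrix} = \begin{Bmatrix} -30 \\ -100 \end{Bmatrix} = \text{vector of original cost coefficients corresponding to the basic variables} \tag{E_2} $$

$$ B = \begin{bmatrix} 4 & 10 \\ 1 & 40 \end{bmatrix} = \text{matrix of original coefficients corresponding to the basic variables} \tag{E_3} $$

$$ B^{-1} = \begin{bmatrix} \beta_{33} & \beta_{32} \\ \beta_{23} & \beta_{22} \end{bmatrix} = \begin{bmatrix} 4/15 & -1/15 \\ -1/150 & 2/75 \end{bmatrix} = \text{inverse of the coefficient matrix B, which appears in the final tableau also} \tag{E_4} $$

$$ \pi = c_B^T B^{-1} = (-30 \ \ {-100}) \begin{bmatrix} 4/15 & -1/15 \\ -1/150 & 2/75 \end{bmatrix} = \begin{Bmatrix} -22/3 \\ -2/3 \end{Bmatrix} = \text{simplex multipliers, the negatives of which appear in the final tableau also} \tag{E_5} $$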
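
The quantities in (E1) to (E5) can be recomputed directly from the original data. The following is a small sketch using exact fractions; the helper `inverse_2x2` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix using the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[4, 10], [1, 40]]      # original columns of x3 and x2
c_B = [-30, -100]           # original costs of x3 and x2
b = [1200, 800]             # original right-hand sides

B_inv = inverse_2x2(B)
# Basic solution X_B = B^{-1} b
x_B = [sum(B_inv[i][j] * b[j] for j in range(2)) for i in range(2)]
# Simplex multipliers pi^T = c_B^T B^{-1}
pi = [sum(c_B[k] * B_inv[k][i] for k in range(2)) for i in range(2)]
# Optimal objective value f = c_B^T X_B
f_min = sum(ck * xk for ck, xk in zip(c_B, x_B))
print(B_inv, x_B, pi, f_min)
```

The negatives of the computed multipliers, 22/3 and 2/3, are the entries that appear under the slack columns $x_5$ and $x_6$ in the objective row of the final tableau.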


Example 4.6 Find the effect of changing the total time available per day on the two machines from 1200 and 800 min to 1500 and 1000 min in Example 4.5.

SOLUTION Equation (4.36) gives

$$ x_i + \sum_{j=1}^{m} \beta_{ij} \Delta b_j \ge 0, \quad i = 1, 2, \ldots, m \tag{4.36} $$

where $x_i$ is the optimum value of the ith basic variable. (This equation assumes that the variables are renumbered such that $x_1$ to $x_m$ represent the basic variables.)

If the variables are not renumbered, Eq. (4.36) will be applicable for i = 3 and 2 in the present problem with $\Delta b_3 = 300$ and $\Delta b_2 = 200$. From Eqs. (E1) to (E5) of Example 4.5, the left-hand sides of Eq. (4.36) become

$$ x_3 + \beta_{33} \Delta b_3 + \beta_{32} \Delta b_2 = \frac{800}{3} + \frac{4}{15}(300) - \frac{1}{15}(200) = \frac{1000}{3} $$

$$ x_2 + \beta_{23} \Delta b_3 + \beta_{22} \Delta b_2 = \frac{40}{3} - \frac{1}{150}(300) + \frac{2}{75}(200) = \frac{50}{3} $$

Since both these values are $\ge 0$, the original optimal basis B remains optimal even with the new values of $b_i$. The new values of the (optimal) basic variables are given by Eq. (4.38) as

$$
X_B' = \begin{Bmatrix} x_3' \\ x_2' \end{Bmatrix} = X_B + \Delta X_B = X_B + B^{-1} \Delta b
= \begin{Bmatrix} 800/3 \\ 40/3 \end{Bmatrix} + \begin{bmatrix} 4/15 & -1/15 \\ -1/150 & 2/75 \end{bmatrix} \begin{Bmatrix} 300 \\ 200 \end{Bmatrix}
= \begin{Bmatrix} 1000/3 \\ 50/3 \end{Bmatrix}
$$

and the optimum value of the objective function by Eq. (4.39) as

$$
f_{\min}' = f_{\min} + \Delta f = f_{\min} + c_B^T \Delta X_B
= -\frac{28{,}000}{3} + (-30 \ \ {-100}) \begin{Bmatrix} 200/3 \\ 10/3 \end{Bmatrix}
= -\frac{35{,}000}{3}
$$

Thus the new profit will be $35,000/3.
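
The right-hand-side analysis of Eqs. (4.36), (4.38), and (4.39) reduces to a few matrix-vector products. The sketch below packages them in a hypothetical helper `rhs_change` and applies it to the data of Example 4.6; the function name and return layout are assumptions of this illustration.

```python
from fractions import Fraction as F

def rhs_change(B_inv, x_B, pi, db):
    """Effect of a right-hand-side change db on an optimal basis.

    Returns (new basic values, basis still optimal?, change in f).
    """
    # Eq. (4.38): delta x_i = sum_j beta_ij * delta b_j
    dx = [sum(bij * d for bij, d in zip(row, db)) for row in B_inv]
    new_x = [x + d for x, d in zip(x_B, dx)]
    # Eq. (4.36): the basis stays optimal if every new value is nonnegative
    still_optimal = all(v >= 0 for v in new_x)
    # Eq. (4.39): delta f = pi^T delta b
    df = sum(p * d for p, d in zip(pi, db))
    return new_x, still_optimal, df

B_inv = [[F(4, 15), F(-1, 15)], [F(-1, 150), F(2, 75)]]
x_B = [F(800, 3), F(40, 3)]          # x3, x2 from Example 4.5
pi = [F(-22, 3), F(-2, 3)]
new_x, ok, df = rhs_change(B_inv, x_B, pi, [300, 200])
new_f = F(-28000, 3) + df
print(new_x, ok, new_f)
```

If `ok` came back false, the basis would no longer be feasible and the recovery procedure of Section 4.5.1 (or the dual simplex method) would be needed instead of this direct update.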


4.5.2 Changes in the Cost Coefficients $c_j$

The problem here is to find the effect of changing the cost coefficients from $c_j$ to $c_j + \Delta c_j$ on the optimal solution obtained with $c_j$. The relative cost coefficients corresponding to the nonbasic variables, $x_{m+1}, x_{m+2}, \ldots, x_n$ are given by Eq. (4.10):

$$ \bar{c}_j = c_j - \pi^T A_j = c_j - \sum_{i=1}^{m} \pi_i a_{ij}, \quad j = m+1, m+2, \ldots, n \tag{4.40} $$

where the simplex multipliers $\pi_i$ are related to the cost coefficients of the basic variables by the relation

$$ \pi^T = c_B^T B^{-1} $$

that is,

$$ \pi_i = \sum_{k=1}^{m} c_k \beta_{ki}, \quad i = 1, 2, \ldots, m \tag{4.41} $$

From Eqs. (4.40) and (4.41), we obtain

$$ \bar{c}_j = c_j - \sum_{i=1}^{m} a_{ij} \left( \sum_{k=1}^{m} c_k \beta_{ki} \right) = c_j - \sum_{k=1}^{m} c_k \left( \sum_{i=1}^{m} a_{ij} \beta_{ki} \right), \quad j = m+1, m+2, \ldots, n \tag{4.42} $$

If the $c_j$ are changed to $c_j + \Delta c_j$, the original optimal solution remains optimal, provided that the new values of $\bar{c}_j$, $\bar{c}_j'$, satisfy the relation

$$
\begin{aligned}
\bar{c}_j' &= c_j + \Delta c_j - \sum_{k=1}^{m} (c_k + \Delta c_k) \left( \sum_{i=1}^{m} a_{ij} \beta_{ki} \right) \ge 0 \\
&= \bar{c}_j + \Delta c_j - \sum_{k=1}^{m} \Delta c_k \left( \sum_{i=1}^{m} a_{ij} \beta_{ki} \right) \ge 0, \quad j = m+1, m+2, \ldots, n
\end{aligned}
\tag{4.43}
$$

where $\bar{c}_j$ indicate the values of the relative cost coefficients corresponding to the original optimal solution.

In particular, if changes are made only in the cost coefficients of the nonbasic variables, Eq. (4.43) reduces to

$$ \bar{c}_j + \Delta c_j \ge 0, \quad j = m+1, m+2, \ldots, n \tag{4.44} $$

If Eq. (4.43) is satisfied, the changes made in $c_j$, $\Delta c_j$, will not affect the optimal basis and the values of the basic variables. The only change that occurs is in the optimal value of the objective function according to

$$ \Delta f = \sum_{j=1}^{m} x_j \Delta c_j \tag{4.45} $$

and this change will be zero if only the $c_j$ of nonbasic variables are changed.

Suppose that Eq. (4.43) is violated for some of the nonbasic variables. Then it is possible to improve the value of the objective function by bringing any nonbasic variable that violates Eq. (4.43) into the basis provided that it can be assigned a nonzero value. This can be done easily with the help of the previous optimal tableau. Since some of the $\bar{c}_j'$ are negative, we start the optimization procedure again by using the old optimum as an initial feasible solution. We continue the iterative process until the new optimum is found. As in the case of changing the right-hand-side $b_i$, the effectiveness of this procedure depends on the number of violations made in Eq. (4.43) by the new values $c_j + \Delta c_j$.

In some of the practical problems, it may become necessary to solve the optimization problem with a series of objective functions. This can be accomplished without reworking the entire problem for each new objective function. Assume that the optimum solution for the first objective function is found by the regular procedure. Then consider the second objective function as obtained by changing the first one and evaluate Eq. (4.43). If the resulting $\bar{c}_j' \ge 0$, the old optimum still remains as optimum and one can proceed to the next objective function in the same manner. On the other hand, if one or more of the resulting $\bar{c}_j' < 0$, we can adopt the procedure outlined above and continue the iterative process using the old optimum as the starting feasible solution. After the optimum is found, we switch to the next objective function.




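Example 4.7 Find the effect of changing $c_3$ from -30 to -24 in Example 4.5.

SOLUTION Here $\Delta c_3 = 6$ and Eq. (4.43) gives that

$$ \bar{c}_1' = \bar{c}_1 + \Delta c_1 - \Delta c_3 [a_{21}\beta_{32} + a_{31}\beta_{33}] = \tfrac{25}{3} + 0 - 6[3(-\tfrac{1}{15}) + 7(\tfrac{4}{15})] = -\tfrac{5}{3} $$

$$ \bar{c}_4' = \bar{c}_4 + \Delta c_4 - \Delta c_3 [a_{24}\beta_{32} + a_{34}\beta_{33}] = \tfrac{50}{3} + 0 - 6[1(-\tfrac{1}{15}) + 9(\tfrac{4}{15})] = \tfrac{8}{3} $$

$$ \bar{c}_5' = \bar{c}_5 + \Delta c_5 - \Delta c_3 [a_{25}\beta_{32} + a_{35}\beta_{33}] = \tfrac{22}{3} + 0 - 6[0(-\tfrac{1}{15}) + 1(\tfrac{4}{15})] = \tfrac{86}{15} $$

$$ \bar{c}_6' = \bar{c}_6 + \Delta c_6 - \Delta c_3 [a_{26}\beta_{32} + a_{36}\beta_{33}] = \tfrac{2}{3} + 0 - 6[1(-\tfrac{1}{15}) + 0(\tfrac{4}{15})] = \tfrac{16}{15} $$

The change in the value of the objective function is given by Eq. (4.45) as

$$ \Delta f = \Delta c_3 \, x_3 = \frac{4800}{3} \quad \text{so that} \quad f = -\frac{28{,}000}{3} + \frac{4800}{3} = -\frac{23{,}200}{3} $$

Since $\bar{c}_1'$ is negative, we can bring $x_1$ into the basis. Thus we start with the optimal tableau of the original problem with the new values of relative cost coefficients and improve the solution according to the regular procedure.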















Basic                           Variables                                           Ratio b_i/a_ij
variables     x1      x2      x3      x4      x5      x6     -f       b_i           for a_ij > 0
x3          [5/3]      0       1      7/3    4/15   -1/15     0     800/3           160   Pivot row
x2           1/30      1       0    -1/30   -1/150   2/75     0      40/3           400
-f           -5/3      0       0      8/3   86/15   16/15     1   23,200/3

Result of pivot operation (the pivot element is shown in brackets above):

x1             1       0      3/5     7/5    4/25   -1/25     0      160
x2             0       1    -1/50   -2/25   -3/250   7/250    0        8
-f             0       0       1       5       6       1      1     8000

Since all the relative cost coefficients are nonnegative, the present solution is optimum with

$x_1 = 160$, $x_2 = 8$ (basic variables)
$x_3 = x_4 = x_5 = x_6 = 0$ (nonbasic variables)
$f_{\min} = -8000$ and maximum profit = $8000
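
The new relative cost coefficients of Example 4.7 can also be obtained by recomputing $\pi^T = c_B^T B^{-1}$ with the changed cost and then applying $\bar{c}_j = c_j - \pi^T A_j$ to every column. The sketch below does this with exact fractions; the variable names are illustrative assumptions, and the rows of `A` are ordered as in the original problem statement (lathe, then milling).

```python
from fractions import Fraction as F

A = [[7, 10, 4, 9, 1, 0],          # lathe constraint, columns x1..x6
     [3, 40, 1, 1, 0, 1]]          # milling constraint
c = [-45, -100, -24, -50, 0, 0]    # c3 changed from -30 to -24
basic = [2, 1]                     # column indices of x3 and x2 (tableau row order)
B_inv = [[F(4, 15), F(-1, 15)], [F(-1, 150), F(2, 75)]]

# Simplex multipliers for the modified costs: pi^T = c_B^T B^{-1}
pi = [sum(c[basic[k]] * B_inv[k][i] for k in range(2)) for i in range(2)]
# Relative cost coefficients: c_bar_j = c_j - pi^T A_j
c_bar = [c[j] - sum(pi[i] * A[i][j] for i in range(2)) for j in range(6)]
# Columns that violate optimality and may enter the basis
entering = [j for j in range(6) if c_bar[j] < 0]
print(c_bar, entering)
```

Only the $x_1$ column (index 0) has a negative relative cost, in agreement with the tableau above.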


4.5.3 Addition of New Variables 

Suppose that the optimum solution of a LP problem with n variables $x_1, x_2, \ldots, x_n$ has been found and we want to examine the effect of adding some more variables $x_{n+k}$, $k = 1, 2, \ldots$, on the optimum solution. Let the constraint coefficients and the cost coefficients corresponding to the new variables $x_{n+k}$ be denoted by $a_{i,n+k}$, i = 1 to m, and $c_{n+k}$, respectively. If the new variables are treated as additional nonbasic variables in the old optimum solution, the corresponding relative cost coefficients are given by

$$ \bar{c}_{n+k} = c_{n+k} - \sum_{i=1}^{m} \pi_i a_{i,n+k} \tag{4.46} $$

where $\pi_1, \pi_2, \ldots, \pi_m$ are the simplex multipliers corresponding to the original optimum solution. The original optimum remains optimum for the new problem also provided that $\bar{c}_{n+k} \ge 0$ for all k. However, if one or more $\bar{c}_{n+k} < 0$, it pays to bring some of the new variables into the basis provided that they can be assigned a nonzero value. For bringing a new variable into the basis, we first have to transform the coefficients $a_{i,n+k}$ into $\bar{a}_{i,n+k}$ so that the columns of the new variables correspond to the canonical form of the old optimal basis. This can be done by using Eq. (4.9) as

$$ \underset{m \times 1}{\bar{A}_{n+k}} = \underset{m \times m}{B^{-1}} \ \underset{m \times 1}{A_{n+k}} $$

that is,

$$ \bar{a}_{i,n+k} = \sum_{j=1}^{m} \beta_{ij} a_{j,n+k}, \quad i = 1 \text{ to } m \tag{4.47} $$

where $B^{-1} = [\beta_{ij}]$ is the inverse of the old optimal basis. The rules for bringing a new variable into the basis, finding a new basic feasible solution, testing this solution for optimality, and the subsequent procedure are the same as those outlined in the regular simplex method.

Example 4.8 In Example 4.5, if a new product, E, which requires 15 min of work on 
the lathe and 10 min on the milling machine per unit, is available, will it be worthwhile 
to manufacture it if the profit per unit is $40? 

SOLUTION Let $x_k$ be the number of units of product E manufactured per day. Then
$c_k = -40$, $a_{1k} = 15$, and $a_{2k} = 10$; therefore,

$$\bar{c}_k = c_k - \pi_1 a_{1k} - \pi_2 a_{2k} = -40 + \left(\tfrac{22}{3}\right)(15) + \left(\tfrac{2}{3}\right)(10) = \tfrac{230}{3} \ge 0$$

Since the relative cost coefficient $\bar{c}_k$ is nonnegative, the original optimum solution
remains optimum for the new problem also and the variable $x_k$ will remain as a nonbasic
variable. This means that it is not worth manufacturing product E.
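The pricing-out test of Eq. (4.46) is a single dot product, so it is easy to script. The sketch below is illustrative only; the function name `relative_cost` is an assumption, and the simplex multipliers are those of the old optimum ($\pi_1 = -22/3$, $\pi_2 = -2/3$).

```python
from fractions import Fraction as F

def relative_cost(c_new, multipliers, column):
    """Eq. (4.46): c_bar = c - sum_i pi_i * a_i for one candidate column."""
    return c_new - sum(p * a for p, a in zip(multipliers, column))

pi = [F(-22, 3), F(-2, 3)]                 # simplex multipliers of the old optimum
c_bar_E = relative_cost(F(-40), pi, [F(15), F(10)])

# A new variable is worth bringing into the basis only if its relative cost is negative.
worth_making = c_bar_E < 0
print(c_bar_E, worth_making)
```

The same function can be applied to any number of candidate products by looping over their cost and column data.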

4.5.4 Changes in the Constraint Coefficients

Here the problem is to investigate the effect of changing the coefficient $a_{ij}$ to $a_{ij} + \Delta a_{ij}$
after finding the optimum solution with $a_{ij}$. There are two possibilities in this case. The
first possibility occurs when all the coefficients $a_{ij}$, in which changes are made, belong
to the columns of those variables that are nonbasic in the old optimal solution. In this
case, the effect of changing $a_{ij}$ on the optimal solution can be investigated by adopting
the procedure outlined in the preceding section. The second possibility occurs when
the coefficients changed $a_{ij}$ correspond to a basic variable, say, $x_{j_0}$ of the old optimal
solution. The following procedure can be adopted to examine the effect of changing
$a_{i,j_0}$ to $a_{i,j_0} + \Delta a_{i,j_0}$.

1. Introduce a new variable $x_{n+1}$ to the original system with constraint coefficients

$$a_{i,n+1} = a_{i,j_0} + \Delta a_{i,j_0} \qquad (4.48)$$

and cost coefficient

$$c_{n+1} = c_{j_0} \ \text{(original value itself)} \qquad (4.49)$$

2. Transform the coefficients $a_{i,n+1}$ to $\bar{a}_{i,n+1}$ by using the inverse of the old optimal
basis, $\mathbf{B}^{-1} = [\beta_{ij}]$, as

$$\bar{a}_{i,n+1} = \sum_{j=1}^{m} \beta_{ij} a_{j,n+1}, \quad i = 1 \text{ to } m \qquad (4.50)$$

3. Replace the original cost coefficient ($c_{j_0}$) of $x_{j_0}$ by a large positive number $N$,
but keep $c_{n+1}$ equal to the old value $c_{j_0}$.

4. Compute the modified cost coefficients using Eq. (4.43):

$$c'_j = \bar{c}_j + \Delta c_j - \sum_{k=1}^{m} \Delta c_k \left( \sum_{i=1}^{m} a_{ij}\beta_{ki} \right), \quad j = m+1, m+2, \ldots, n, n+1 \qquad (4.51)$$

where $\Delta c_k = 0$ for $k = 1, 2, \ldots, j_0 - 1, j_0 + 1, \ldots, m$ and $\Delta c_{j_0} = N - c_{j_0}$.

5. Carry out the regular iterative procedure of the simplex method with the new objective
function and the augmented matrix found in Eqs. (4.50) and (4.51) until the
new optimum is found.

Remarks:

1. The number $N$ has to be taken sufficiently large to ensure that $x_{j_0}$ cannot be
contained in the new optimal basis that is ultimately going to be found.

2. The procedure above can easily be extended to cases where changes in coefficients
$a_{ij}$ of more than one column are made.

3. The present procedure will be computationally efficient (compared to reworking
of the problem from the beginning) only for cases where there are not too many
basic columns in which the $a_{ij}$ are changed.

Example 4.9 Find the effect of changing $\mathbf{A}_1$ from $\begin{Bmatrix} 7 \\ 3 \end{Bmatrix}$ to $\begin{Bmatrix} 6 \\ 10 \end{Bmatrix}$ in Example 4.5 (i.e.,
changes are made in the coefficients $a_{ij}$ of nonbasic variables only).


SOLUTION The relative cost coefficients of the nonbasic variables (of the original
optimum solution) corresponding to the new $a_{ij}$ are given by

$$\bar{c}_j = c_j - \boldsymbol{\pi}^T \mathbf{A}_j, \quad j = \text{nonbasic } (1, 4, 5, 6)$$

Since $\mathbf{A}_1$ is changed, we have

$$\bar{c}_1 = c_1 - \boldsymbol{\pi}^T \mathbf{A}_1 = -45 - \left(-\tfrac{22}{3} \ \ -\tfrac{2}{3}\right)\begin{Bmatrix} 6 \\ 10 \end{Bmatrix} = \tfrac{17}{3}$$

As $\bar{c}_1$ is positive, the original optimum solution remains optimum for the new problem
also.
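When a nonbasic column changes, two quantities are needed: the relative cost coefficient of the new column and, if that coefficient turns out negative, the column transformed by $\mathbf{B}^{-1}$ as in Eq. (4.47). The sketch below computes both with exact fractions. It assumes the new column is $\{6, 10\}^T$ (the values consistent with the result $17/3$ above) and uses the inverse of the old optimal basis for $(x_3, x_2)$; the function names are hypothetical.

```python
from fractions import Fraction as F

# Inverse of the old optimal basis (columns of x3 and x2) and its simplex multipliers.
B_inv = [[F(4, 15), F(-1, 15)],
         [F(-1, 150), F(2, 75)]]
pi = [F(-22, 3), F(-2, 3)]

def price_out(cost, multipliers, column):
    """Relative cost coefficient c_bar = c - pi^T A for a changed column."""
    return cost - sum(p * a for p, a in zip(multipliers, column))

def transform(b_inv, column):
    """Canonical-form column A_bar = B^{-1} A, Eq. (4.47)."""
    return [sum(b * a for b, a in zip(row, column)) for row in b_inv]

new_A1 = [F(6), F(10)]          # assumed new column of x1
c_bar_1 = price_out(F(-45), pi, new_A1)
A_bar_1 = transform(B_inv, new_A1)
print(c_bar_1, A_bar_1)
```

Only when `c_bar_1` is negative does the transformed column have to be written into the tableau for a further pivot.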


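Example 4.10 Find the effect of changing $\mathbf{A}_1$ from $\begin{Bmatrix} 7 \\ 3 \end{Bmatrix}$ to $\begin{Bmatrix} 5 \\ 6 \end{Bmatrix}$ in Example 4.5.


SOLUTION The relative cost coefficient of the nonbasic variable $x_1$ for the new $\mathbf{A}_1$
is given by

$$\bar{c}_1 = c_1 - \boldsymbol{\pi}^T \mathbf{A}_1 = -45 - \left(-\tfrac{22}{3} \ \ -\tfrac{2}{3}\right)\begin{Bmatrix} 5 \\ 6 \end{Bmatrix} = -\tfrac{13}{3}$$

Since $\bar{c}_1$ is negative, $x_1$ can be brought into the basis to reduce the objective function
further. For this we start with the original optimum tableau with the new values of $\bar{\mathbf{A}}_1$
given by

$$\bar{\mathbf{A}}_1 = \mathbf{B}^{-1}\mathbf{A}_1 = \begin{bmatrix} \tfrac{4}{15} & -\tfrac{1}{15} \\ -\tfrac{1}{150} & \tfrac{2}{75} \end{bmatrix}\begin{Bmatrix} 5 \\ 6 \end{Bmatrix} = \begin{Bmatrix} \tfrac{14}{15} \\ \tfrac{19}{150} \end{Bmatrix}$$

  Basic                           Variables
  variables   x1       x2        x3   x4       x5       x6      -f   b_i          b_i/a_is
  x3          14/15    0         1    7/3      4/15     -1/15   0    800/3        4000/14
  x2          19/150   1         0    -1/30    -1/150   2/75    0    40/3         2000/19 <-
  -f          -13/3    0         0    50/3     22/3     2/3     1    28,000/3

  Pivot element: 19/150 (row x2, column x1); x1 enters the basis and x2 leaves.

  x3          0        -140/19   1    49/19    6/19     -5/19   0    3,200/19
  x1          1        150/19    0    -5/19    -1/19    4/19    0    2,000/19
  -f          0        650/19    0    295/19   135/19   30/19   1    186,000/19

Since all $\bar{c}_j$ are nonnegative, the present tableau gives the new optimum solution as

$x_1 = 2000/19$, $x_3 = 3200/19$ (basic variables)
$x_2 = x_4 = x_5 = x_6 = 0$ (nonbasic variables)
$f_{\min} = -186{,}000/19$ and maximum profit = $186,000/19

4.5.5 Addition of Constraints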

Suppose that we have solved a LP problem with m constraints and obtained the optimal 
solution. We want to examine the effect of adding some more inequality constraints on 
the original optimum solution. For this we evaluate the new constraints by substituting 
the old optimal solution and see whether they are satisfied. If they are satisfied, it means 
that the inclusion of the new constraints in the old problem would not have affected 
the old optimum solution, and hence the old optimal solution remains optimal for the 
new problem also. On the other hand, if one or more of the new constraints are not 
satisfied by the old optimal solution, we can solve the problem without reworking the 
entire problem by proceeding as follows. 

1. The simplex tableau corresponding to the old optimum solution expresses all the 
basic variables in terms of the nonbasic ones. With this information, eliminate 
the basic variables from the new constraints. 

2. Transform the constraints thus obtained by multiplying throughout by -1.

3. Add the resulting constraints to the old optimal tableau and introduce one artificial
variable for each new constraint added. Thus the enlarged system of
equations will be in canonical form since the old basic variables were eliminated
from the new constraints in step 1. Hence a new basis, consisting of the
old optimal basis plus the artificial variables in the new constraint equations,
will be readily available from this canonical form.

4. Go through phase I computations to eliminate the artificial variables. 

5. Go through phase II computations to find the new optimal solution. 
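The first two steps (checking the new constraint at the old optimum, then eliminating the basic variables from it) are purely algebraic and can be sketched in a few lines. The code below is an illustration under stated assumptions, not the text's own implementation: it uses the grinding-machine constraint $2x_1 + 5x_2 + 3x_3 + 4x_4 = 600$ of the example that follows, reads the basic rows of the old optimal tableau as $x_B = b - \sum_j \bar{a}_j x_j$, and the function names are hypothetical. The sign of the reduced constraint is chosen so that its right-hand side is nonnegative.

```python
from fractions import Fraction as F

# Old optimal tableau: basic variable -> (value, {nonbasic index: tableau coefficient}).
basic_rows = {
    3: (F(800, 3), {1: F(5, 3), 4: F(7, 3), 5: F(4, 15), 6: F(-1, 15)}),
    2: (F(40, 3),  {1: F(1, 30), 4: F(-1, 30), 5: F(-1, 150), 6: F(2, 75)}),
}
new_coeffs = {1: F(2), 2: F(5), 3: F(3), 4: F(4)}   # left-hand side of the new constraint
new_rhs = F(600)

def lhs_at_old_optimum(coeffs, rows):
    """Value of the new constraint's left-hand side at the old optimum."""
    return sum(a * rows[j][0] for j, a in coeffs.items() if j in rows)

def eliminate_basics(coeffs, rhs, rows):
    """Substitute x_B = b - sum(a_bar * x_N) so only nonbasic variables remain."""
    reduced = {}
    for j, a in coeffs.items():
        if j in rows:
            value, row = rows[j]
            rhs -= a * value
            for k, t in row.items():
                reduced[k] = reduced.get(k, F(0)) - a * t
        else:
            reduced[j] = reduced.get(j, F(0)) + a
    return reduced, rhs

lhs = lhs_at_old_optimum(new_coeffs, basic_rows)
reduced, rhs = eliminate_basics(new_coeffs, new_rhs, basic_rows)
if rhs < 0:                                  # make the right-hand side nonnegative
    reduced = {k: -v for k, v in reduced.items()}
    rhs = -rhs
print(lhs, reduced, rhs)
```

If `lhs` already equals the required right-hand side, the old optimum is kept and no further work is needed; otherwise `reduced` and `rhs` give the row to append to the tableau together with an artificial variable.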

Example 4.11 If each of the products A, B, C, and D require, respectively, 2, 5, 3, 
and 4 min of time per unit on grinding machine in addition to the operations specified 
in Example 4.5, find the new optimum solution. Assume that the total time available 
on grinding machine per day is 600 min and all this time has to be utilized fully. 

SOLUTION The present data correspond to the addition of a constraint that can be
stated as

$$2x_1 + 5x_2 + 3x_3 + 4x_4 = 600 \qquad (E_1)$$

By substituting the original optimum solution,

$$x_2 = \tfrac{40}{3}, \quad x_3 = \tfrac{800}{3}, \quad x_1 = x_4 = x_5 = x_6 = 0$$

the left-hand side of Eq. (E1) gives

$$2(0) + 5\left(\tfrac{40}{3}\right) + 3\left(\tfrac{800}{3}\right) + 4(0) = \tfrac{2600}{3} \ne 600$$

Thus the new constraint is not satisfied by the original optimum solution. Hence we
proceed as follows.

Step 1 From the original optimum tableau, we can express the basic variables as

$$x_3 = \tfrac{800}{3} - \tfrac{5}{3}x_1 - \tfrac{7}{3}x_4 - \tfrac{4}{15}x_5 + \tfrac{1}{15}x_6$$

$$x_2 = \tfrac{40}{3} - \tfrac{1}{30}x_1 + \tfrac{1}{30}x_4 + \tfrac{1}{150}x_5 - \tfrac{2}{75}x_6$$

Thus Eq. (E1) can be expressed as

$$2x_1 + 5\left(\tfrac{40}{3} - \tfrac{1}{30}x_1 + \tfrac{1}{30}x_4 + \tfrac{1}{150}x_5 - \tfrac{2}{75}x_6\right) + 3\left(\tfrac{800}{3} - \tfrac{5}{3}x_1 - \tfrac{7}{3}x_4 - \tfrac{4}{15}x_5 + \tfrac{1}{15}x_6\right) + 4x_4 = 600$$

that is,

$$-\tfrac{19}{6}x_1 - \tfrac{17}{6}x_4 - \tfrac{23}{30}x_5 + \tfrac{1}{15}x_6 = -\tfrac{800}{3} \qquad (E_2)$$

Step 2 Transform this constraint such that the right-hand side becomes positive,
that is,

$$\tfrac{19}{6}x_1 + \tfrac{17}{6}x_4 + \tfrac{23}{30}x_5 - \tfrac{1}{15}x_6 = \tfrac{800}{3} \qquad (E_3)$$

Step 3 Add an artificial variable, say, $x_k$, the new constraint given by Eq. (E3) and the
infeasibility form $w = x_k$ into the original optimum tableau to obtain the new
canonical system as follows:

  Basic                              Variables
  variables   x1      x2   x3   x4      x5       x6      xk   -f   -w   b_i        b_i/a_is
  x3          5/3     0    1    7/3     4/15     -1/15   0    0    0    800/3      160
  x2          1/30    1    0    -1/30   -1/150   2/75    0    0    0    40/3       400
  xk          19/6    0    0    17/6    23/30    -1/15   1    0    0    800/3      1600/19 <-
  -f          25/3    0    0    50/3    22/3     2/3     0    1    0    28,000/3
  -w          -19/6   0    0    -17/6   -23/30   1/15    0    0    1    -800/3

  Pivot element: 19/6 (row xk, column x1); x1 enters the basis and xk leaves.

Step 4 Eliminate the artificial variable by applying the phase I procedure:

  Basic                              Variables
  variables   x1   x2   x3   x4       x5       x6       xk       -f   -w   b_i
  x3          0    0    1    16/19    -13/95   -3/95    -10/19   0    0    2,400/19
  x2          0    1    0    -6/95    -7/475   13/475   -1/95    0    0    200/19
  x1          1    0    0    17/19    23/95    -2/95    6/19     0    0    1,600/19
  -f          0    0    0    175/19   101/19   16/19    -50/19   1    0    164,000/19
  -w          0    0    0    0        0        0        1        0    1    0

Thus the new optimum solution is given by

$x_1 = 1600/19$, $x_2 = 200/19$, $x_3 = 2400/19$ (basic variables)
$x_4 = x_5 = x_6 = 0$ (nonbasic variables)
$f_{\min} = -164{,}000/19$ and maximum profit = $164,000/19

4.6 TRANSPORTATION PROBLEM


This section deals with an important class of LP problems called the transportation
problem. As the name indicates, a transportation problem is one in which the objective
for minimization is the cost of transporting a certain commodity from a number
of origins to a number of destinations. Although the transportation problem can be
solved using the regular simplex method, its special structure offers a more convenient
procedure for solving this type of problem. This procedure is based on the same theory
as the simplex method, but it makes use of some shortcuts that yield a simpler
computational scheme.

Suppose that there are $m$ origins $R_1, R_2, \ldots, R_m$ (e.g., warehouses) and $n$ destinations,
$D_1, D_2, \ldots, D_n$ (e.g., factories). Let $a_i$ be the amount of a commodity
available at origin $i$ ($i = 1, 2, \ldots, m$) and $b_j$ be the amount required at destination
$j$ ($j = 1, 2, \ldots, n$). Let $c_{ij}$ be the cost per unit of transporting the commodity from
origin $i$ to destination $j$. The objective is to determine the amount of commodity ($x_{ij}$)
transported from origin $i$ to destination $j$ such that the total transportation costs are
minimized. This problem can be formulated mathematically as

$$\text{Minimize } f = \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}x_{ij} \qquad (4.52)$$

subject to

$$\sum_{j=1}^{n} x_{ij} = a_i, \quad i = 1, 2, \ldots, m \qquad (4.53)$$

$$\sum_{i=1}^{m} x_{ij} = b_j, \quad j = 1, 2, \ldots, n \qquad (4.54)$$

$$x_{ij} \ge 0, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n \qquad (4.55)$$


Clearly, this is a LP problem in mn variables and m + n equality constraints. 

Equations (4.53) state that the total amount of the commodity transported from
the origin $i$ to the various destinations must be equal to the amount available at origin
$i$ ($i = 1, 2, \ldots, m$), while Eqs. (4.54) state that the total amount of the commodity
received by destination $j$ from all the sources must be equal to the amount required at
the destination $j$ ($j = 1, 2, \ldots, n$). The nonnegativity conditions Eqs. (4.55) are added
since negative values for any $x_{ij}$ have no physical meaning. It is assumed that the total
demand equals the total supply, that is,

$$\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j \qquad (4.56)$$

Equation (4.56), called the consistency condition, must be satisfied if a solution is to
exist. This can be seen easily since

$$\sum_{i=1}^{m} a_i = \sum_{i=1}^{m}\left(\sum_{j=1}^{n} x_{ij}\right) = \sum_{j=1}^{n}\left(\sum_{i=1}^{m} x_{ij}\right) = \sum_{j=1}^{n} b_j \qquad (4.57)$$

The problem stated in Eqs. (4.52) to (4.56) was originally formulated and solved by
Hitchcock in 1941 [4.6]. This was also considered independently by Koopmans in
1947 [4.7]. Because of these early investigations the problem is sometimes called the
Hitchcock-Koopmans transportation problem. The special structure of the transportation
matrix can be seen by writing the equations in standard form:

$$\begin{aligned}
x_{11} + x_{12} + \cdots + x_{1n} &= a_1 \\
x_{21} + x_{22} + \cdots + x_{2n} &= a_2 \\
&\ \ \vdots \\
x_{m1} + x_{m2} + \cdots + x_{mn} &= a_m
\end{aligned} \qquad (4.58a)$$

$$\begin{aligned}
x_{11} + x_{21} + \cdots + x_{m1} &= b_1 \\
x_{12} + x_{22} + \cdots + x_{m2} &= b_2 \\
&\ \ \vdots \\
x_{1n} + x_{2n} + \cdots + x_{mn} &= b_n
\end{aligned} \qquad (4.58b)$$

$$c_{11}x_{11} + c_{12}x_{12} + \cdots + c_{1n}x_{1n} + c_{21}x_{21} + \cdots + c_{2n}x_{2n} + \cdots + c_{m1}x_{m1} + \cdots + c_{mn}x_{mn} = f \qquad (4.58c)$$


We notice the following properties from Eqs. (4.58): 

1. All the nonzero coefficients of the constraints are equal to 1 . 

2. The constraint coefficients appear in a triangular form. 

3. Any variable appears only once in the first m equations and once in the next n 
equations. 
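Properties 1 and 3 can be seen directly by generating the coefficient matrix of Eqs. (4.58a) and (4.58b). The sketch below builds that matrix for a small case and also checks the consistency condition of Eq. (4.56). The function names and the supply and demand figures are hypothetical and chosen only for illustration.

```python
def transportation_matrix(m, n):
    """Coefficient matrix of Eqs. (4.58a)-(4.58b).

    Rows: m supply equations followed by n demand equations.
    Columns: x_11, x_12, ..., x_1n, x_21, ..., x_mn.
    """
    rows = [[0] * (m * n) for _ in range(m + n)]
    for i in range(m):
        for j in range(n):
            col = i * n + j
            rows[i][col] = 1          # x_ij appears in the supply equation of origin i
            rows[m + j][col] = 1      # and in the demand equation of destination j
    return rows

def is_consistent(supply, demand):
    """Consistency condition, Eq. (4.56): total supply equals total demand."""
    return sum(supply) == sum(demand)

m, n = 2, 3
A = transportation_matrix(m, n)
# Each variable should appear exactly once among the first m equations
# and exactly once among the next n equations.
once_in_supply = all(sum(A[i][c] for i in range(m)) == 1 for c in range(m * n))
once_in_demand = all(sum(A[m + j][c] for j in range(n)) == 1 for c in range(m * n))

for row in A:
    print(row)
print(once_in_supply, once_in_demand, is_consistent([30, 20], [10, 25, 15]))
```

Because every entry is 0 or 1, updating a solution on the transportation array needs only additions and subtractions.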

These are the special properties of the transportation problem that allow devel- 
opment of the transportation technique. To facilitate the identification of a starting 
solution, the system of equations (4.58) is represented in the form of an array, called 
the transportation array, as shown in Fig. 4.2. In all the techniques developed for solv- 
ing the transportation problem, the calculations are made directly on the transportation 
array. 

Computational Procedure. The solution of a LP problem, in general, requires a 
calculator or, if the problem is large, a high-speed digital computer. On the other hand, 
the solution of a transportation problem can often be obtained with the use of a pencil 
and paper since additions and subtractions are the only calculations required. The basic 
steps involved in the solution of a transportation problem are 

1. Determine a starting basic feasible solution. 

2. Test the current basic feasible solution for optimality. If the current solution is 
optimal, stop the iterative process; otherwise, go to step 3. 

3. Select a variable to enter the basis from among the current nonbasic variables.

4. Select a variable to leave from the basis from among the current basic variables
(using the feasibility condition).

5. Find a new basic feasible solution and return to step 2.

The details of these steps are given in Ref. [4.10].

                                     Destination j                            Amount
  Origin i       1            2            3          ...        n            available a_i
     1       x11 (c11)    x12 (c12)    x13 (c13)      ...    x1n (c1n)        a1
     2       x21 (c21)    x22 (c22)    x23 (c23)      ...    x2n (c2n)        a2
     3       x31 (c31)    x32 (c32)    x33 (c33)      ...    x3n (c3n)        a3
    ...
     m       xm1 (cm1)    xm2 (cm2)    xm3 (cm3)      ...    xmn (cmn)        am
  Amount
  required b_j   b1           b2           b3         ...        bn

Figure 4.2 Transportation array. Each cell (i, j) holds the amount x_ij together with its unit cost c_ij.


4.7 KARMARKAR'S INTERIOR METHOD 

Karmarkar proposed a new method in 1984 for solving large-scale linear programming 
problems very efficiently. The method is known as an interior method since it finds 
improved search directions strictly in the interior of the feasible space. This is in 
contrast with the simplex method, which searches along the boundary of the feasible 
space by moving from one feasible vertex to an adjacent one until the optimum point 
is found. For large LP problems, the number of vertices will be quite large and hence 
the simplex method would become very expensive in terms of computer time. Along 
with many other applications, Karmarkar's method has been applied to aircraft route
scheduling problems. It was reported [4.19] that Karmarkar's method solved problems
involving 150,000 design variables and 12,000 constraints in 1 hour while the simplex
method required 4 hours for solving a smaller problem involving only 36,000 design
variables and 10,000 constraints. In fact, it was found that Karmarkar's method is as
much as 50 times faster than the simplex method for large problems.


Figure 4.3 Improvement of objective function from different points of a polytope. (Labels recovered from the figure: minimum value of f; horizontal axis x1.)


Karmarkar's method is based on the following two observations:

1. If the current solution is near the center of the polytope, we can move along the
steepest descent direction to reduce the value of $f$ by a maximum amount. From
Fig. 4.3, we can see that the current solution can be improved substantially by
moving along the steepest descent direction if it is near the center (point 2) but
not near the boundary point (points 1 and 3).

2. The solution space can always be transformed without changing the nature of 
the problem so that the current solution lies near the center of the polytope. 

It is well known that in many numerical problems, by changing the units of data or 
rescaling (e.g., using feet instead of inches), we may be able to reduce the numerical 
instability. In a similar manner, Karmarkar observed that the variables can be trans- 
formed (in a more general manner than ordinary rescaling) so that straight lines remain 
straight lines while angles and distances change for the feasible space. 

4.7.1 Statement of the Problem 

Karmarkar's method requires the LP problem in the following form:

Minimize $f = \mathbf{c}^T\mathbf{X}$

subject to

$$[a]\mathbf{X} = \mathbf{0}$$
$$x_1 + x_2 + \cdots + x_n = 1$$
$$\mathbf{X} \ge \mathbf{0} \qquad (4.59)$$

where $\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$, $\mathbf{c} = \{c_1, c_2, \ldots, c_n\}^T$, and $[a]$ is an $m \times n$ matrix. In
addition, an interior feasible starting solution to Eqs. (4.59) must be known. Usually,

$$\mathbf{X} = \left\{\frac{1}{n} \ \ \frac{1}{n} \ \ \cdots \ \ \frac{1}{n}\right\}^T$$

is chosen as the starting point. In addition, the optimum value of $f$ must be zero for
the problem. Thus

$$\mathbf{X}^{(1)} = \left\{\frac{1}{n} \ \ \frac{1}{n} \ \ \cdots \ \ \frac{1}{n}\right\}^T = \text{interior feasible}, \qquad f_{\min} = 0 \qquad (4.60)$$

Although most LP problems may not be available in the form of Eq. (4.59) while 
satisfying the conditions of Eq. (4.60), it is possible to put any LP problem in a form 
that satisfies Eqs. (4.59) and (4.60) as indicated below. 


4.7.2 Conversion of an LP Problem into the Required Form 

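Let the given LP problem be of the form

Minimize $\mathbf{d}^T\mathbf{X}$

subject to

$$[\alpha]\mathbf{X} = \mathbf{b}, \qquad \mathbf{X} \ge \mathbf{0} \qquad (4.61)$$

To convert this problem into the form of Eq. (4.59), we use the procedure suggested
in Ref. [4.20] and define integers $m$ and $n$ such that $\mathbf{X}$ will be an $(n - 3)$-component
vector and $[\alpha]$ will be a matrix of order $(m - 1) \times (n - 3)$. We now define the vector
$\bar{\mathbf{z}} = \{z_1, z_2, \ldots, z_{n-3}\}^T$ as

$$\bar{\mathbf{z}} = \frac{\mathbf{X}}{\beta} \qquad (4.62)$$

where $\beta$ is a constant chosen to have a sufficiently large value such that

$$\beta > \sum_{i=1}^{n-3} x_i \qquad (4.63)$$

for any feasible solution $\mathbf{X}$ (assuming that the solution is bounded). By using Eq. (4.62),
the problem of Eq. (4.61) can be stated as follows:

Minimize $\beta\,\mathbf{d}^T\bar{\mathbf{z}}$

subject to

$$[\alpha]\bar{\mathbf{z}} = \frac{1}{\beta}\mathbf{b}, \qquad \bar{\mathbf{z}} \ge \mathbf{0} \qquad (4.64)$$

We now define a new vector $\mathbf{z}$ as

$$\mathbf{z} = \begin{Bmatrix} \bar{\mathbf{z}} \\ z_{n-2} \\ z_{n-1} \\ z_n \end{Bmatrix}$$

and solve the following related problem instead of the problem in Eqs. (4.64):

Minimize $\{\beta\,\mathbf{d}^T \ \ 0 \ \ 0 \ \ M\}\,\mathbf{z}$

subject to

$$\begin{bmatrix} [\alpha] & \mathbf{0} & -\dfrac{n}{\beta}\mathbf{b} & \left(\dfrac{n}{\beta}\mathbf{b} - [\alpha]\mathbf{e}\right) \\ \mathbf{0} & 0 & n & 0 \end{bmatrix}\mathbf{z} = \begin{Bmatrix} \mathbf{0} \\ 1 \end{Bmatrix}$$
$$\mathbf{e}^T\bar{\mathbf{z}} + z_{n-2} + z_{n-1} + z_n = 1$$
$$\mathbf{z} \ge \mathbf{0} \qquad (4.65)$$

where $\mathbf{e}$ is an $(n - 3)$-component vector whose elements are all equal to 1, $z_{n-2}$ is a
slack variable that absorbs the difference between 1 and the sum of other variables,
$z_{n-1}$ is constrained to have a value of $1/n$, and $M$ is given a large value (corresponding
to the artificial variable $z_n$) to force $z_n$ to zero when the problem stated in Eqs. (4.61)
has a feasible solution. Equations (4.65) are developed such that if $\mathbf{z}$ is a solution to
these equations, $\mathbf{X} = \beta\bar{\mathbf{z}}$ will be a solution to Eqs. (4.61) if Eqs. (4.61) have a feasible
solution. Also, it can be verified that the interior point $\mathbf{z} = (1/n)\mathbf{e}$, with $\mathbf{e}$ here taken
as the $n$-component vector of ones, is a feasible solution to Eqs. (4.65). Equations (4.65)
can be seen to be the desired form of Eqs. (4.61) except for a 1 on the right-hand side.
This can be eliminated by subtracting the last constraint from the next-to-last constraint,
to obtain the required form:

Minimize $\{\beta\,\mathbf{d}^T \ \ 0 \ \ 0 \ \ M\}\,\mathbf{z}$

subject to

$$\begin{bmatrix} [\alpha] & \mathbf{0} & -\dfrac{n}{\beta}\mathbf{b} & \left(\dfrac{n}{\beta}\mathbf{b} - [\alpha]\mathbf{e}\right) \\ -\mathbf{e}^T & -1 & (n - 1) & -1 \end{bmatrix}\mathbf{z} = \begin{Bmatrix} \mathbf{0} \\ 0 \end{Bmatrix}$$
$$\mathbf{e}^T\bar{\mathbf{z}} + z_{n-2} + z_{n-1} + z_n = 1$$
$$\mathbf{z} \ge \mathbf{0} \qquad (4.66)$$

Note: When Eqs. (4.66) are solved, if the value of the artificial variable $z_n > 0$,
the original problem in Eqs. (4.61) is infeasible. On the other hand, if the value
of the slack variable $z_{n-2} = 0$, the solution of the problem given by Eqs. (4.61) is
unbounded.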


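Example 4.12 Transform the following LP problem into a form required by Karmarkar's
method:

Minimize $2x_1 + 3x_2$

subject to

$$3x_1 + x_2 - 2x_3 = 3$$
$$5x_1 - 2x_2 = 2$$
$$x_i \ge 0, \quad i = 1, 2, 3$$

SOLUTION It can be seen that

$$\mathbf{d} = \{2 \ \ 3 \ \ 0\}^T, \quad [\alpha] = \begin{bmatrix} 3 & 1 & -2 \\ 5 & -2 & 0 \end{bmatrix}, \quad \mathbf{b} = \begin{Bmatrix} 3 \\ 2 \end{Bmatrix}, \quad \text{and} \quad \mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}$$

We define the integers $m$ and $n$ as $n = 6$ and $m = 3$ and choose $\beta = 10$ so that

$$\bar{\mathbf{z}} = \begin{Bmatrix} z_1 \\ z_2 \\ z_3 \end{Bmatrix} = \frac{1}{10}\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}$$

Noting that $\mathbf{e} = \{1, 1, 1\}^T$, Eqs. (4.66) can be expressed as

Minimize $\{20 \ \ 30 \ \ 0 \ \ 0 \ \ 0 \ \ M\}\,\mathbf{z}$

subject to

$$\left[\begin{bmatrix} 3 & 1 & -2 \\ 5 & -2 & 0 \end{bmatrix} \ \ \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \ \ -\frac{6}{10}\begin{Bmatrix} 3 \\ 2 \end{Bmatrix} \ \ \left(\frac{6}{10}\begin{Bmatrix} 3 \\ 2 \end{Bmatrix} - \begin{bmatrix} 3 & 1 & -2 \\ 5 & -2 & 0 \end{bmatrix}\begin{Bmatrix} 1 \\ 1 \\ 1 \end{Bmatrix}\right)\right]\mathbf{z} = \mathbf{0}$$
$$\{-\{1 \ \ 1 \ \ 1\} \ \ -1 \ \ 5 \ \ -1\}\,\mathbf{z} = 0$$
$$z_1 + z_2 + z_3 + z_4 + z_5 + z_6 = 1$$
$$\mathbf{z} = \{z_1 \ \ z_2 \ \ z_3 \ \ z_4 \ \ z_5 \ \ z_6\}^T \ge \mathbf{0}$$

where $M$ is a very large number. These equations can be seen to be in the desired
form.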
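The conversion of Eqs. (4.61) into the form of Eqs. (4.66) can be sketched as a small routine. The code below is an illustration under stated assumptions, not the reference's own procedure: the function name `karmarkar_form` is hypothetical, the value used for $M$ is an arbitrary large number, and the data are those of Example 4.12. It also confirms numerically that the center point $\mathbf{z} = (1/n)\{1, \ldots, 1\}^T$ satisfies the homogeneous constraints.

```python
from fractions import Fraction as F

def karmarkar_form(d, alpha, b, beta, M):
    """Build the cost vector and homogeneous constraint rows of Eqs. (4.66)."""
    n = len(d) + 3                      # original variables plus three added ones
    ratio = F(n) / F(beta)
    cost = [F(beta) * F(dj) for dj in d] + [F(0), F(0), F(M)]
    rows = []
    for row, bi in zip(alpha, b):
        row_sum = sum(F(v) for v in row)            # i-th component of [alpha] e
        rows.append([F(v) for v in row]
                    + [F(0), -ratio * bi, ratio * bi - row_sum])
    # Last row: -e^T z_bar - z_{n-2} + (n - 1) z_{n-1} - z_n = 0
    rows.append([F(-1)] * (n - 3) + [F(-1), F(n - 1), F(-1)])
    return cost, rows

cost, rows = karmarkar_form(d=[2, 3, 0],
                            alpha=[[3, 1, -2], [5, -2, 0]],
                            b=[3, 2], beta=10, M=10**6)

n = len(cost)
center = [F(1, n)] * n
residuals = [sum(a * z for a, z in zip(row, center)) for row in rows]
print(cost[:5], rows, residuals)
```

The zero residuals show why the center of the simplex can serve as the interior feasible starting point required by Eq. (4.60).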


4.7.3 Algorithm 

Starting from an interior feasible point $\mathbf{X}^{(1)}$, Karmarkar's method finds a sequence of
points $\mathbf{X}^{(2)}, \mathbf{X}^{(3)}, \ldots$ using the following iterative procedure:

1. Initialize the iterative process. Begin with the center point of the simplex as the
initial feasible point

$$\mathbf{X}^{(1)} = \left\{\frac{1}{n} \ \ \frac{1}{n} \ \ \cdots \ \ \frac{1}{n}\right\}^T$$

Set the iteration number as $k = 1$.

2. Test for optimality. Since $f = 0$ at the optimum point, we stop the procedure
if the following convergence criterion is satisfied:

$$\|\mathbf{c}^T\mathbf{X}^{(k)}\| \le \varepsilon \qquad (4.67)$$

where $\varepsilon$ is a small number. If Eq. (4.67) is not satisfied, go to step 3.

3. Compute the next point, $\mathbf{X}^{(k+1)}$. For this, we first find a point $\mathbf{Y}^{(k+1)}$ in the
transformed unit simplex as

$$\mathbf{Y}^{(k+1)} = \left\{\frac{1}{n} \ \ \frac{1}{n} \ \ \cdots \ \ \frac{1}{n}\right\}^T - \frac{\alpha\left([I] - [P]^T([P][P]^T)^{-1}[P]\right)[D(\mathbf{X}^{(k)})]\,\mathbf{c}}{\|\mathbf{c}\|\sqrt{n(n-1)}} \qquad (4.68)$$

where $\|\mathbf{c}\|$ is the length of the vector $\mathbf{c}$, $[I]$ the identity matrix of order $n$,
$[D(\mathbf{X}^{(k)})]$ an $n \times n$ matrix with all off-diagonal entries equal to 0, and diagonal
entries equal to the components of the vector $\mathbf{X}^{(k)}$ as

$$[D(\mathbf{X}^{(k)})]_{ii} = x_i^{(k)}, \quad i = 1, 2, \ldots, n \qquad (4.69)$$

$[P]$ is an $(m + 1) \times n$ matrix whose first $m$ rows are given by $[a][D(\mathbf{X}^{(k)})]$
and the last row is composed of 1's:

$$[P] = \begin{bmatrix} [a][D(\mathbf{X}^{(k)})] \\ 1 \ \ 1 \ \ \cdots \ \ 1 \end{bmatrix} \qquad (4.70)$$

and the value of the parameter $\alpha$ is usually chosen as $\alpha = \frac{1}{4}$ to ensure convergence.
Once $\mathbf{Y}^{(k+1)}$ is found, the components of the new point $\mathbf{X}^{(k+1)}$ are
determined as

$$x_i^{(k+1)} = \frac{x_i^{(k)}\,y_i^{(k+1)}}{\sum_{r=1}^{n} x_r^{(k)}\,y_r^{(k+1)}}, \quad i = 1, 2, \ldots, n \qquad (4.71)$$

Set the new iteration number as $k = k + 1$ and go to step 2.


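Example 4.13 Find the solution of the following problem using Karmarkar's method:

Minimize $f = 2x_1 + x_2 - x_3$

subject to

$$x_2 - x_3 = 0$$
$$x_1 + x_2 + x_3 = 1$$
$$x_i \ge 0, \quad i = 1, 2, 3 \qquad (E_1)$$

Use the value of $\varepsilon = 0.05$ for testing the convergence of the procedure.

SOLUTION The problem is already in the required form of Eq. (4.59), and hence
the following iterative procedure can be used to find the solution of the problem.

Step 1 We choose the initial feasible point as

$$\mathbf{X}^{(1)} = \begin{Bmatrix} \tfrac{1}{3} \\ \tfrac{1}{3} \\ \tfrac{1}{3} \end{Bmatrix}$$

and set $k = 1$.

Step 2 Since $|f(\mathbf{X}^{(1)})| = \left|\tfrac{2}{3}\right| > 0.05$, we go to step 3.

Step 3 Since $[a] = \{0, 1, -1\}$, $\mathbf{c} = \{2, 1, -1\}^T$, $\|\mathbf{c}\| = \sqrt{(2)^2 + (1)^2 + (-1)^2} = \sqrt{6}$,
we find that

$$[D(\mathbf{X}^{(1)})] = \begin{bmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}$$

$$[a][D(\mathbf{X}^{(1)})] = \left\{0 \ \ \tfrac{1}{3} \ \ -\tfrac{1}{3}\right\}$$

$$[P] = \begin{bmatrix} [a][D(\mathbf{X}^{(1)})] \\ 1 \ \ 1 \ \ 1 \end{bmatrix} = \begin{bmatrix} 0 & \tfrac{1}{3} & -\tfrac{1}{3} \\ 1 & 1 & 1 \end{bmatrix}$$

$$([P][P]^T)^{-1} = \begin{bmatrix} \tfrac{2}{9} & 0 \\ 0 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} \tfrac{9}{2} & 0 \\ 0 & \tfrac{1}{3} \end{bmatrix}$$

$$[D(\mathbf{X}^{(1)})]\,\mathbf{c} = \begin{bmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{bmatrix}\begin{Bmatrix} 2 \\ 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} \tfrac{2}{3} \\ \tfrac{1}{3} \\ -\tfrac{1}{3} \end{Bmatrix}$$

$$\left([I] - [P]^T([P][P]^T)^{-1}[P]\right)[D(\mathbf{X}^{(1)})]\,\mathbf{c} = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{1}{6} & \tfrac{1}{6} \\ -\tfrac{1}{3} & \tfrac{1}{6} & \tfrac{1}{6} \end{bmatrix}\begin{Bmatrix} \tfrac{2}{3} \\ \tfrac{1}{3} \\ -\tfrac{1}{3} \end{Bmatrix} = \begin{Bmatrix} \tfrac{4}{9} \\ -\tfrac{2}{9} \\ -\tfrac{2}{9} \end{Bmatrix}$$

Using $\alpha = \tfrac{1}{4}$, Eq. (4.68) gives

$$\mathbf{Y}^{(2)} = \begin{Bmatrix} \tfrac{1}{3} \\ \tfrac{1}{3} \\ \tfrac{1}{3} \end{Bmatrix} - \frac{1}{4}\begin{Bmatrix} \tfrac{4}{9} \\ -\tfrac{2}{9} \\ -\tfrac{2}{9} \end{Bmatrix}\frac{1}{\sqrt{3(2)}\sqrt{6}} = \begin{Bmatrix} \tfrac{34}{108} \\ \tfrac{37}{108} \\ \tfrac{37}{108} \end{Bmatrix}$$

Noting that

$$\sum_{r=1}^{3} x_r^{(1)}y_r^{(2)} = \frac{1}{3}\left(\frac{34}{108}\right) + \frac{1}{3}\left(\frac{37}{108}\right) + \frac{1}{3}\left(\frac{37}{108}\right) = \frac{1}{3}$$

Eq. (4.71) can be used to find

$$\{x_i^{(2)}\} = \left\{\frac{x_i^{(1)}y_i^{(2)}}{\sum_{r=1}^{3} x_r^{(1)}y_r^{(2)}}\right\} = 3\begin{Bmatrix} \tfrac{34}{324} \\ \tfrac{37}{324} \\ \tfrac{37}{324} \end{Bmatrix} = \begin{Bmatrix} \tfrac{34}{108} \\ \tfrac{37}{108} \\ \tfrac{37}{108} \end{Bmatrix}$$

Set the new iteration number as $k = k + 1 = 2$ and go to step 2. The procedure
is to be continued until convergence is achieved.

Notes:

1. Although $\mathbf{X}^{(2)} = \mathbf{Y}^{(2)}$ in this example, they need not be, in general, equal to
one another.

2. The value of $f$ at $\mathbf{X}^{(2)}$ is

$$f(\mathbf{X}^{(2)}) = 2\left(\tfrac{34}{108}\right) + \tfrac{37}{108} - \tfrac{37}{108} = \tfrac{17}{27} < f(\mathbf{X}^{(1)}) = \tfrac{18}{27}$$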
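The iteration of Eqs. (4.68) to (4.71) can be sketched in plain Python to reproduce the first step of Example 4.13 and to continue until the test of Eq. (4.67) is met. This is a minimal illustration under stated assumptions, not a production implementation: the function names are hypothetical, a small Gauss-Jordan routine stands in for the inverse of $[P][P]^T$, and the step is scaled by $\|\mathbf{c}\|\sqrt{n(n-1)}$ exactly as written in Eq. (4.68).

```python
import math

def solve(mat, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(mat)
    aug = [row[:] + [r] for row, r in zip(mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def karmarkar_step(a, c, x, alpha=0.25):
    """One pass of step 3: Eq. (4.68) followed by Eq. (4.71)."""
    n = len(x)
    # [P]: rows of [a][D(X)] followed by a row of ones, Eq. (4.70).
    p = [[aij * xj for aij, xj in zip(row, x)] for row in a] + [[1.0] * n]
    dc = [xj * cj for xj, cj in zip(x, c)]                       # [D(X)] c
    ppt = [[sum(u * v for u, v in zip(r1, r2)) for r2 in p] for r1 in p]
    w = solve(ppt, [sum(u * v for u, v in zip(r, dc)) for r in p])
    # Projected vector ([I] - P^T (P P^T)^-1 P) [D(X)] c.
    proj = [dc[j] - sum(p[i][j] * w[i] for i in range(len(p))) for j in range(n)]
    scale = alpha / (math.sqrt(sum(cj * cj for cj in c)) * math.sqrt(n * (n - 1)))
    y = [1.0 / n - scale * pj for pj in proj]                    # Eq. (4.68)
    total = sum(xj * yj for xj, yj in zip(x, y))
    return [xj * yj / total for xj, yj in zip(x, y)]             # Eq. (4.71)

def karmarkar(a, c, eps=0.05, max_iter=1000):
    """Iterate from the center of the simplex until |c^T X| <= eps."""
    n = len(c)
    x = [1.0 / n] * n
    k = 1
    while abs(sum(cj * xj for cj, xj in zip(c, x))) > eps and k < max_iter:
        x = karmarkar_step(a, c, x)
        k += 1
    return x, k

a = [[0.0, 1.0, -1.0]]
c = [2.0, 1.0, -1.0]
x2 = karmarkar_step(a, c, [1 / 3, 1 / 3, 1 / 3])
x_final, iterations = karmarkar(a, c)
print(x2, x_final, iterations)
```

The first call should match $\mathbf{X}^{(2)}$ of the example. With the step scaled by $\|\mathbf{c}\|$ rather than by the length of the projected vector, the steps shrink as $f$ approaches zero, so reaching $\varepsilon = 0.05$ takes many more iterations than the single one worked above.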


4.8 QUADRATIC PROGRAMMING 

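A quadratic programming problem can be stated as

$$\text{Minimize } f(\mathbf{X}) = \mathbf{C}^T\mathbf{X} + \tfrac{1}{2}\mathbf{X}^T\mathbf{D}\mathbf{X} \qquad (4.72)$$

subject to

$$\mathbf{A}\mathbf{X} \le \mathbf{B} \qquad (4.73)$$
$$\mathbf{X} \ge \mathbf{0} \qquad (4.74)$$

where

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix}, \quad \mathbf{C} = \begin{Bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{Bmatrix}, \quad \mathbf{B} = \begin{Bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{Bmatrix},$$

$$\mathbf{D} = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & \cdots & d_{2n} \\ \vdots & & & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} \end{bmatrix}, \quad \text{and} \quad \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

In Eq. (4.72) the term $\mathbf{X}^T\mathbf{D}\mathbf{X}/2$ represents the quadratic part of the objective
function with $\mathbf{D}$ being a symmetric positive-definite matrix. If $\mathbf{D} = \mathbf{0}$, the problem
reduces to a LP problem. The solution of the quadratic programming problem stated
in Eqs. (4.72) to (4.74) can be obtained by using the Lagrange multiplier technique.
By introducing the slack variables $s_i^2$, $i = 1, 2, \ldots, m$, in Eqs. (4.73) and the surplus
variables $t_j^2$, $j = 1, 2, \ldots, n$, in Eqs. (4.74), the quadratic programming problem can
be written as

$$\text{Minimize } f(\mathbf{X}) = \mathbf{C}^T\mathbf{X} + \tfrac{1}{2}\mathbf{X}^T\mathbf{D}\mathbf{X} \qquad (4.72)$$

subject to the equality constraints

$$\mathbf{A}_i^T\mathbf{X} + s_i^2 = b_i, \quad i = 1, 2, \ldots, m \qquad (4.75)$$
$$-x_j + t_j^2 = 0, \quad j = 1, 2, \ldots, n \qquad (4.76)$$

where

$$\mathbf{A}_i = \begin{Bmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{in} \end{Bmatrix}$$

The Lagrange function can be written as

$$L(\mathbf{X}, \mathbf{S}, \mathbf{T}, \boldsymbol{\lambda}, \boldsymbol{\theta}) = \mathbf{C}^T\mathbf{X} + \tfrac{1}{2}\mathbf{X}^T\mathbf{D}\mathbf{X} + \sum_{i=1}^{m}\lambda_i(\mathbf{A}_i^T\mathbf{X} + s_i^2 - b_i) + \sum_{j=1}^{n}\theta_j(-x_j + t_j^2) \qquad (4.77)$$

The necessary conditions for the stationariness of $L$ give

$$\frac{\partial L}{\partial x_j} = c_j + \sum_{i=1}^{n} d_{ij}x_i + \sum_{i=1}^{m}\lambda_i a_{ij} - \theta_j = 0, \quad j = 1, 2, \ldots, n \qquad (4.78)$$

$$\frac{\partial L}{\partial s_i} = 2\lambda_i s_i = 0, \quad i = 1, 2, \ldots, m \qquad (4.79)$$

$$\frac{\partial L}{\partial t_j} = 2\theta_j t_j = 0, \quad j = 1, 2, \ldots, n \qquad (4.80)$$

$$\frac{\partial L}{\partial \lambda_i} = \mathbf{A}_i^T\mathbf{X} + s_i^2 - b_i = 0, \quad i = 1, 2, \ldots, m \qquad (4.81)$$

$$\frac{\partial L}{\partial \theta_j} = -x_j + t_j^2 = 0, \quad j = 1, 2, \ldots, n \qquad (4.82)$$

By defining a set of new variables $Y_i$ as

$$Y_i = s_i^2 \ge 0, \quad i = 1, 2, \ldots, m \qquad (4.83)$$

Equations (4.81) can be written as

$$\mathbf{A}_i^T\mathbf{X} - b_i = -s_i^2 = -Y_i, \quad i = 1, 2, \ldots, m \qquad (4.84)$$

Multiplying Eq. (4.79) by $s_i$ and Eq. (4.80) by $t_j$, we obtain

$$\lambda_i s_i^2 = \lambda_i Y_i = 0, \quad i = 1, 2, \ldots, m \qquad (4.85)$$
$$\theta_j t_j^2 = 0, \quad j = 1, 2, \ldots, n \qquad (4.86)$$

Combining Eqs. (4.84) and (4.85), and Eqs. (4.82) and (4.86), we obtain

$$\lambda_i(\mathbf{A}_i^T\mathbf{X} - b_i) = 0, \quad i = 1, 2, \ldots, m \qquad (4.87)$$
$$\theta_j x_j = 0, \quad j = 1, 2, \ldots, n \qquad (4.88)$$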

Thus the necessary conditions can be summarized as follows:

$$c_j - \theta_j + \sum_{i=1}^{n} x_i d_{ij} + \sum_{i=1}^{m}\lambda_i a_{ij} = 0, \quad j = 1, 2, \ldots, n \qquad (4.89)$$
$$\mathbf{A}_i^T\mathbf{X} - b_i = -Y_i, \quad i = 1, 2, \ldots, m \qquad (4.90)$$
$$x_j \ge 0, \quad j = 1, 2, \ldots, n \qquad (4.91)$$
$$Y_i \ge 0, \quad i = 1, 2, \ldots, m \qquad (4.92)$$
$$\lambda_i \ge 0, \quad i = 1, 2, \ldots, m \qquad (4.93)$$
$$\theta_j \ge 0, \quad j = 1, 2, \ldots, n \qquad (4.94)$$
$$\lambda_i Y_i = 0, \quad i = 1, 2, \ldots, m \qquad (4.95)$$
$$\theta_j x_j = 0, \quad j = 1, 2, \ldots, n \qquad (4.96)$$

We can notice one important thing in Eqs. (4.89) to (4.96). With the exception of
Eqs. (4.95) and (4.96), the necessary conditions are linear functions of the variables
$x_j$, $Y_i$, $\lambda_i$, and $\theta_j$. Thus the solution of the original quadratic programming problem
can be obtained by finding a nonnegative solution to the set of $m + n$ linear equations
given by Eqs. (4.89) and (4.90), which also satisfies the $m + n$ equations stated in Eqs.
(4.95) and (4.96).

Since $\mathbf{D}$ is a positive-definite matrix, $f(\mathbf{X})$ will be a strictly convex function,† and since
the feasible space is convex (because of linear equations), any local minimum of the
problem will be the global minimum. Further, it can be seen that there are $2(n + m)$
variables and $2(n + m)$ equations in the necessary conditions stated in Eqs. (4.89) to
(4.96). Hence the solution of the Eqs. (4.89), (4.90), (4.95), and (4.96) must be unique.
Thus the feasible solution satisfying all the Eqs. (4.89) to (4.96), if it exists, must give
the optimum solution of the quadratic programming problem directly. The solution
of the system of equations above can be obtained by using phase I of the simplex
method. The only restriction here is that the satisfaction of the nonlinear relations, Eqs.
(4.95) and (4.96), has to be maintained all the time. Since our objective is just to find
a feasible solution to the set of Eqs. (4.89) to (4.96), there is no necessity of phase
II computations. We shall follow the procedure developed by Wolfe [4.21] to apply
phase I.

† See Appendix A for the definition and properties of a convex function.

This procedure involves the introduction of $n$ nonnegative artificial variables
$z_j$ into the Eqs. (4.89) so that

$$c_j - \theta_j + \sum_{i=1}^{n} x_i d_{ij} + \sum_{i=1}^{m}\lambda_i a_{ij} + z_j = 0, \quad j = 1, 2, \ldots, n \qquad (4.97)$$

Then we minimize

$$f = \sum_{j=1}^{n} z_j \qquad (4.98)$$

subject to the constraints

$$c_j - \theta_j + \sum_{i=1}^{n} x_i d_{ij} + \sum_{i=1}^{m}\lambda_i a_{ij} + z_j = 0, \quad j = 1, 2, \ldots, n$$
$$\mathbf{A}_i^T\mathbf{X} + Y_i = b_i, \quad i = 1, 2, \ldots, m$$
$$\mathbf{X} \ge \mathbf{0}, \quad \mathbf{Y} \ge \mathbf{0}, \quad \boldsymbol{\lambda} \ge \mathbf{0}, \quad \boldsymbol{\theta} \ge \mathbf{0}$$

While solving this problem, we have to take care of the additional conditions

$$\lambda_i Y_i = 0, \quad i = 1, 2, \ldots, m$$
$$\theta_j x_j = 0, \quad j = 1, 2, \ldots, n \qquad (4.99)$$

Thus when deciding whether to introduce $Y_i$ into the basic solution, we first have to
ensure that either $\lambda_i$ is not in the solution or $\lambda_i$ will be removed when $Y_i$ enters the
basis. Similar care has to be taken regarding the variables $\theta_j$ and $x_j$. These additional
checks are not very difficult to make during the solution procedure.
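Whatever procedure produces a candidate solution, the conditions of Eqs. (4.89) to (4.96) can be verified directly. The sketch below is a checker only, not Wolfe's procedure: given $\mathbf{X}$ and $\boldsymbol{\lambda}$ it recovers $\theta_j$ from Eq. (4.89) and $Y_i$ from Eq. (4.90) and then tests nonnegativity and complementarity. The function name is hypothetical. The data are those of the example that follows, and the candidate point $x_1 = 32/13$, $x_2 = 14/13$, $\lambda_1 = 8/13$, $\lambda_2 = 0$ was worked out separately for this illustration rather than quoted from the text.

```python
from fractions import Fraction as F

def check_kkt(C, D, A, B, x, lam):
    """Recover theta and Y from Eqs. (4.89)-(4.90) and test Eqs. (4.91)-(4.96)."""
    n, m = len(x), len(B)
    theta = [C[j] + sum(D[i][j] * x[i] for i in range(n))
             + sum(lam[i] * A[i][j] for i in range(m)) for j in range(n)]
    Y = [B[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    nonnegative = all(v >= 0 for v in list(x) + list(lam) + theta + Y)
    complementary = (all(lam[i] * Y[i] == 0 for i in range(m))
                     and all(theta[j] * x[j] == 0 for j in range(n)))
    return theta, Y, nonnegative and complementary

C = [F(-4), F(0)]
D = [[F(2), F(-2)], [F(-2), F(4)]]
A = [[F(2), F(1)], [F(1), F(-4)]]
B = [F(6), F(0)]

# Candidate point (an assumption of this sketch, derived separately).
theta, Y, ok = check_kkt(C, D, A, B, x=[F(32, 13), F(14, 13)], lam=[F(8, 13), F(0)])
# The origin with zero multipliers fails: theta_1 would have to be negative.
_, _, ok_origin = check_kkt(C, D, A, B, x=[F(0), F(0)], lam=[F(0), F(0)])
print(theta, Y, ok, ok_origin)
```

A check of this kind is useful after each tableau of phase I, since the complementarity relations are the only part of the conditions that the simplex pivoting rules do not enforce by themselves.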


Example 4.14 

subject to 


Minimize / = — 4xi + x\ — 2x\X2 + 2x z 

2xi + X 2 < 6 
x\ — 4.12 < 0 
X\ > 0, X2 > 0 


SOLUTION By introducing the slack variables Y\ — and Yo — sj and the surplus 
variables 6\ — t\ and 9 2 = t%, the problem can be stated as follows: 


subject to 


Minimize / = (—4 0) P 1 

yx 2 


+ i ( X! 



-2 

4 




(Hi) 


— x\ + 0\ — 0 
—X2 + 02 = 0 
By comparing this problem with the one stated in Eqs. (4.72) to (4.74), we find that

$$c_1 = -4, \quad c_2 = 0, \quad \mathbf{D} = \begin{bmatrix} 2 & -2 \\ -2 & 4 \end{bmatrix}, \quad \mathbf{A} = \begin{bmatrix} 2 & 1 \\ 1 & -4 \end{bmatrix}, \quad \mathbf{A}_1 = \begin{Bmatrix} 2 \\ 1 \end{Bmatrix}, \quad \mathbf{A}_2 = \begin{Bmatrix} 1 \\ -4 \end{Bmatrix}, \quad \text{and} \quad \mathbf{B} = \begin{Bmatrix} 6 \\ 0 \end{Bmatrix}$$

The necessary conditions for the solution of the problem stated in Eqs. (E1) can be
obtained, using Eqs. (4.89) to (4.96), as

$$-4 - \theta_1 + 2x_1 - 2x_2 + 2\lambda_1 + \lambda_2 = 0$$
$$0 - \theta_2 - 2x_1 + 4x_2 + \lambda_1 - 4\lambda_2 = 0$$
$$2x_1 + x_2 - 6 = -Y_1$$
$$x_1 - 4x_2 - 0 = -Y_2 \tag{E2}$$

$$x_1 \ge 0, \; x_2 \ge 0, \; Y_1 \ge 0, \; Y_2 \ge 0, \; \lambda_1 \ge 0, \; \lambda_2 \ge 0, \; \theta_1 \ge 0, \; \theta_2 \ge 0 \tag{E3}$$

$$\lambda_1 Y_1 = 0, \quad \theta_1 x_1 = 0$$
$$\lambda_2 Y_2 = 0, \quad \theta_2 x_2 = 0 \tag{E4}$$

(If $Y_i$ is in the basis, $\lambda_i$ cannot be in the basis, and if $x_j$ is in the basis, $\theta_j$ cannot be
in the basis to satisfy these equations.) Equations (E2) can be rewritten as

$$2x_1 - 2x_2 + 2\lambda_1 + \lambda_2 - \theta_1 + z_1 = 4$$
$$-2x_1 + 4x_2 + \lambda_1 - 4\lambda_2 - \theta_2 + z_2 = 0$$
$$2x_1 + x_2 + Y_1 = 6$$
$$x_1 - 4x_2 + Y_2 = 0 \tag{E5}$$

where $z_1$ and $z_2$ are artificial variables. To find a feasible solution to Eqs. (E2) to (E4)
by using phase I of simplex method, we minimize $w = z_1 + z_2$ with constraints stated
in Eqs. (E5), (E3), and (E4). The initial simplex tableau is shown below:


Basic                                   Variables                                                 b_i/a_is
variables    x1     x2     λ1     λ2     θ1     θ2     Y1     Y2     z1     z2      w     b_i    for a_is > 0
Y1            2      1      0      0      0      0      1      0      0      0      0      6     6
Y2            1     -4      0      0      0      0      0      1      0      0      0      0
z1            2     -2      2      1     -1      0      0      0      1      0      0      4
z2           -2      4      1     -4      0     -1      0      0      0      1      0      0     0  <- Smaller one
-w            0     -2     -3      3      1      1      0      0      0      0      1     -4

Column x2: selected for entering next basis. Column λ1: most negative.
According to the regular procedure of simplex method, $\lambda_1$ enters the next basis since
the cost coefficient of $\lambda_1$ is most negative and $z_2$ leaves the basis since the ratio $b_i/a_{is}$
is smaller for $z_2$. However, $\lambda_1$ cannot enter the basis, as $Y_1$ is already in the basis [to
satisfy Eqs. (E4)]. Hence we select $x_2$ for entering the next basis. According to this
choice, $z_2$ leaves the basis. By carrying out the required pivot operation, we obtain the
following tableau:


Basic                                   Variables                                                 b_i/a_is
variables    x1     x2     λ1     λ2     θ1     θ2     Y1     Y2     z1     z2      w     b_i    for a_is > 0
Y1          5/2      0   -1/4      1      0    1/4      1      0      0   -1/4      0      6     12/5  <- Smaller one
Y2           -1      0      1     -4      0     -1      0      1      0      1      0      0
z1            1      0    5/2     -1     -1   -1/2      0      0      1    1/2      0      4     4
x2         -1/2      1    1/4     -1      0   -1/4      0      0      0    1/4      0      0
-w           -1      0   -5/2      1      1    1/2      0      0      0    1/2      1     -4

Column x1: selected to enter the basis. Column λ1: most negative.

This tableau shows that $\lambda_1$ has to enter the basis and $Y_2$ or $x_2$ has to leave the basis.
However, $\lambda_1$ cannot enter the basis since $Y_1$ is already in the basis [to satisfy the
requirement of Eqs. (E4)]. Hence $x_1$ is selected to enter the basis and this gives $Y_1$ as
the variable that leaves the basis. The pivot operation on the element 5/2 results in the
following tableau:


Basic                                   Variables                                                 b_i/a_is
variables    x1     x2     λ1     λ2     θ1     θ2     Y1     Y2     z1     z2      w     b_i    for a_is > 0
x1            1      0  -1/10    2/5      0   1/10    2/5      0      0  -1/10      0   12/5
Y2            0      0   9/10  -18/5      0  -9/10    2/5      1      0   9/10      0   12/5     8/3
z1            0      0   13/5   -7/5     -1   -3/5   -2/5      0      1    3/5      0    8/5     8/13  <- Smaller one
x2            0      1    1/5   -4/5      0   -1/5    1/5      0      0    1/5      0    6/5     6
-w            0      0  -13/5    7/5      1    3/5    2/5      0      0    2/5      1   -8/5

Column λ1: most negative.


From this tableau we find that $\lambda_1$ enters the basis (this can be permitted this time since
$Y_1$ is not in the basis) and $z_1$ leaves the basis. The necessary pivot operation gives the
following tableau:
Basic                                   Variables                                                 b_i/a_is
variables    x1     x2     λ1     λ2     θ1     θ2     Y1     Y2     z1     z2      w     b_i    for a_is > 0
x1            1      0      0   9/26  -1/26   1/13   5/13      0   1/26  -1/13      0  32/13
Y2            0      0      0 -81/26   9/26  -9/13   7/13      1  -9/26   9/13      0  24/13
λ1            0      0      1  -7/13  -5/13  -3/13  -2/13      0   5/13   3/13      0   8/13
x2            0      1      0  -9/13   1/13  -2/13   3/13      0  -1/13   2/13      0  14/13
-w            0      0      0      0      0      0      0      0      1      1      1      0


Since both the artificial variables $z_1$ and $z_2$ are driven out of the basis, the present tableau
gives the desired solution as $x_1 = \frac{32}{13}$, $x_2 = \frac{14}{13}$, $Y_2 = \frac{24}{13}$, $\lambda_1 = \frac{8}{13}$ (basic variables),
$\lambda_2 = 0$, $Y_1 = 0$, $\theta_1 = 0$, $\theta_2 = 0$ (nonbasic variables). Thus the solution of the original
quadratic programming problem is given by

$$x_1^* = \frac{32}{13}, \quad x_2^* = \frac{14}{13}, \quad \text{and} \quad f_{\min} = f(x_1^*, x_2^*) = -\frac{88}{13}$$
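The four tableaus of Example 4.14 can be reproduced with exact rational arithmetic. The sketch below is a minimal illustration, not a general solver: it hard-codes Eqs. (E5), uses the assumed string names `"L1"`, `"L2"` for $\lambda_1, \lambda_2$ and `"T1"`, `"T2"` for $\theta_1, \theta_2$, and applies a simplified form of the restricted entry rule (a column is admissible only if its complementary partner is not currently basic).

```python
from fractions import Fraction as F

cols = ["x1", "x2", "L1", "L2", "T1", "T2", "Y1", "Y2", "z1", "z2"]
# Complementary pairs from Eqs. (E4): lambda_i with Y_i, theta_j with x_j.
partner = {"x1": "T1", "T1": "x1", "x2": "T2", "T2": "x2",
           "L1": "Y1", "Y1": "L1", "L2": "Y2", "Y2": "L2"}

# Rows of the initial tableau; the last entry of each row is b_i.
rows = [[F(v) for v in r] for r in (
    [2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 6],      # Y1 row
    [1, -4, 0, 0, 0, 0, 0, 1, 0, 0, 0],     # Y2 row
    [2, -2, 2, 1, -1, 0, 0, 0, 1, 0, 4],    # z1 row
    [-2, 4, 1, -4, 0, -1, 0, 0, 0, 1, 0],   # z2 row
)]
basis = ["Y1", "Y2", "z1", "z2"]
RHS = 10

# Cost row of w = z1 + z2 expressed in nonbasic variables (the -w row).
cost = [-(rows[2][k] + rows[3][k]) for k in range(11)]
cost[8] = cost[9] = F(0)

while True:
    # Admissible columns: negative cost and complementary partner not basic.
    admissible = [k for k in range(RHS)
                  if cost[k] < 0 and partner.get(cols[k]) not in basis]
    if not admissible:
        break
    s = min(admissible, key=lambda k: cost[k])          # entering column
    r = min((rows[i][RHS] / rows[i][s], i)              # ratio test
            for i in range(4) if rows[i][s] > 0)[1]
    pivot = rows[r][s]
    rows[r] = [v / pivot for v in rows[r]]
    for i in range(4):
        if i != r:
            m = rows[i][s]
            rows[i] = [a - m * p for a, p in zip(rows[i], rows[r])]
    m = cost[s]
    cost = [a - m * p for a, p in zip(cost, rows[r])]
    basis[r] = cols[s]

solution = {name: rows[i][RHS] for i, name in enumerate(basis)}
w = -cost[RHS]
x1, x2 = solution["x1"], solution["x2"]
f_min = -4 * x1 + x1**2 - 2 * x1 * x2 + 2 * x2**2
print(solution, w, f_min)
```

Working in `Fraction` keeps every entry exact, so the intermediate rows can be compared directly with the fractions printed in the tableaus.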


4.9 MATLAB SOLUTIONS 

The solutions of linear programming problems, based on interior point method, and 
quadratic programming problems using MATLAB are illustrated by the following 
examples. 

Example 4.15 Find the solution of the following linear programming problem using 
MATLAB (interior point method): 


Minimize $f = -x_1 - 2x_2 - x_3$

subject to

$2x_1 + x_2 - x_3 \le 2$
$2x_1 - x_2 + 5x_3 \le 6$
$4x_1 + x_2 + x_3 \le 6$
$x_i \ge 0; \quad i = 1, 2, 3$

SOLUTION

Step 1 Express the objective function in the form $f(x) = f^T x$ and identify the vectors
$x$ and $f$ as

$$x = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} \quad \text{and} \quad f = \begin{Bmatrix} -1 \\ -2 \\ -1 \end{Bmatrix}$$

Express the constraints in the form $A x \le b$ and identify the matrix $A$ and the
vector $b$ as

$$A = \begin{bmatrix} 2 & 1 & -1 \\ 2 & -1 & 5 \\ 4 & 1 & 1 \end{bmatrix} \quad \text{and} \quad b = \begin{Bmatrix} 2 \\ 6 \\ 6 \end{Bmatrix}$$


Step 2 Use the command for executing linear programming program using interior 
point method as indicated below: 

clc
clear all
f = [-1; -2; -1];              % objective coefficients
A = [2  1 -1;
     2 -1  5;
     4  1  1];                 % inequality constraint matrix, A*x <= b
b = [2; 6; 6];
lb = zeros(3,1);               % lower bounds x >= 0
Aeq = [];                      % no equality constraints
beq = [];
options = optimset('Display', 'iter');
[x, fval, exitflag, output] = linprog(f, A, b, Aeq, beq, lb, [], [], options)


This produces the solution or output as follows:

Iter 0    1.03e+003    7.97e+000    1.50e+003    4.00e+002
Iter 1    4.11e+002    2.22e-016    2.78e+002    4.72e+001
Iter 2    1.16e-013    1.90e-015    2.85e+000    2.33e-001
Iter 3    1.78e-015    1.80e-015    3.96e-002    3.96e-003
Iter 4    7.48e-014    1.02e-015    1.99e-006    1.99e-007
Iter 5    2.51e-015    4.62e-015    1.99e-012    1.98e-013
Optimization terminated.
x =
    0.0000
    4.0000
    2.0000
fval = -10.0000
exitflag = 1
output =
      iterations: 5
       algorithm: 'large-scale: interior point'
    cgiterations: 0
         message: 'Optimization terminated.'
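The reported optimum can be checked independently with a duality argument. The following sketch (plain Python, not MATLAB) verifies that $x = (0, 4, 2)$ is feasible and that a dual vector $y = (1/2, 0, 3/2)$ is feasible with the same objective value. The vector `y` is a hand-computed assumption introduced here for illustration; it is not part of the MATLAB output.

```python
from fractions import Fraction as F

f = [-1, -2, -1]
A = [[2, 1, -1], [2, -1, 5], [4, 1, 1]]
b = [2, 6, 6]
x = [0, 4, 2]                  # solution reported by linprog
y = [F(1, 2), F(0), F(3, 2)]   # assumed dual multipliers (hand computed)

# Primal feasibility: A x <= b and x >= 0.
primal_feasible = all(xj >= 0 for xj in x) and all(
    sum(a * xj for a, xj in zip(row, x)) <= bi for row, bi in zip(A, b))

# Dual feasibility for min f'x s.t. Ax <= b, x >= 0: y >= 0 and A'y + f >= 0.
dual_feasible = all(yi >= 0 for yi in y) and all(
    sum(A[i][j] * y[i] for i in range(3)) + f[j] >= 0 for j in range(3))

primal_value = sum(fj * xj for fj, xj in zip(f, x))
dual_value = -sum(bi * yi for bi, yi in zip(b, y))
print(primal_feasible, dual_feasible, primal_value, dual_value)
```

Because any dual-feasible value is a lower bound on the primal objective, equal primal and dual values confirm that the reported point is optimal.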
Example 4.16 Find the solution of the following quadratic programming problem
using MATLAB:

Minimize $f = -4x_1 + x_1^2 - 2x_1x_2 + 2x_2^2$
subject to $2x_1 + x_2 \le 6$, $x_1 - 4x_2 \le 0$, $x_1 \ge 0$, $x_2 \ge 0$

SOLUTION

Step 1 Express the objective function in the form $f(x) = \frac{1}{2} x^T H x + f^T x$ and identify
the matrix $H$ and vectors $f$ and $x$:

$$H = \begin{bmatrix} 2 & -2 \\ -2 & 4 \end{bmatrix}, \quad f = \begin{Bmatrix} -4 \\ 0 \end{Bmatrix}, \quad x = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}$$

Step 2 State the constraints in the form $Ax \le b$ and identify the matrix $A$ and vector $b$:

$$A = \begin{bmatrix} 2 & 1 \\ 1 & -4 \end{bmatrix}, \quad b = \begin{Bmatrix} 6 \\ 0 \end{Bmatrix}$$

Step 3 Use the command for executing quadratic programming as

[x,fval] = quadprog(H, f, A, b)

which returns the solution vector $x$ that minimizes

$f = \frac{1}{2} x^T H x + f^T x$ subject to $Ax \le b$

The MATLAB solution is given below:

clear; clc;
H = [2 -2; -2 4];
f = [-4 0];
A = [2 1; 1 -4];
b = [6; 0];
[x,fval] = quadprog(H, f, A, b)

Warning: Large-scale method does not currently solve this
problem formulation, switching to medium-scale method.
x =
    2.4615
    1.0769
fval =
   -6.7692
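The MATLAB result agrees with the exact solution $x_1 = 32/13$, $x_2 = 14/13$ found in Example 4.14. A short Python check of the objective value and the Kuhn-Tucker conditions is sketched below; the multiplier values `lam` are taken from Example 4.14, and the variable names are illustrative assumptions.

```python
from fractions import Fraction as F

H = [[2, -2], [-2, 4]]
f = [-4, 0]
A = [[2, 1], [1, -4]]
b = [6, 0]
x = [F(32, 13), F(14, 13)]   # exact counterpart of the printed 2.4615, 1.0769
lam = [F(8, 13), F(0)]       # lambda_1, lambda_2 from Example 4.14

Hx = [sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]
fval = F(1, 2) * sum(x[i] * Hx[i] for i in range(2)) + sum(f[i] * x[i] for i in range(2))

# Slack of each constraint, b - A x (must be nonnegative).
slack = [b[k] - sum(A[k][j] * x[j] for j in range(2)) for k in range(2)]

# Stationarity with theta = 0 (both x_j > 0): H x + f + A' lam = 0.
stationarity = [Hx[j] + f[j] + sum(A[k][j] * lam[k] for k in range(2))
                for j in range(2)]

print([round(float(v), 4) for v in x], round(float(fval), 4))
print(slack, stationarity)
```

The first constraint is active (zero slack) with a positive multiplier, while the second is inactive with a zero multiplier, which is the complementary pattern required by Eqs. (E4) of Example 4.14.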
REVIEW QUESTIONS

4.1 Is the decomposition method efficient for all LP problems? 

4.2 What is the scope of postoptimality analysis? 

4.3 Why is Karmarkar’s method called an interior method? 

4.4 What is the major difference between the simplex and Karmarkar methods? 

4.5 State the form of LP problem required by Karmarkar’s method. 

4.6 What are the advantages of the revised simplex method? 

4.7 Match the following terms and descriptions: 


(a) Karmarkar’s method 

(b) Simplex method 

(c) Quadratic programming 

(d) Dual simplex method 

(e) Decomposition method 


Moves from one vertex to another 
Interior point algorithm 
Phase I computations not required 
Dantzig and Wolfe method 
Wolfe’s method 


4.8 Answer true or false: 

(a) The quadratic programming problem is a convex programming problem. 

(b) It is immaterial whether a given LP problem is designated the primal or dual. 

(c) If the primal problem involves minimization of f subject to greater-than constraints,
its dual deals with the minimization of f subject to less-than constraints.

(d) If the primal problem has an unbounded solution, its dual will also have an unbounded 
solution. 

(e) The transportation problem can be solved by simplex method. 

4.9 Match the following in the context of duality theory: 


(a) $x_i$ is nonnegative

(b) $x_i$ is unrestricted

(c) ith constraint is of equality type

(d) ith constraint is of greater-than or equal-to type

(e) Minimization type


ith constraint is of less-than or equal-to type
Maximization type
ith variable is unrestricted
ith variable is nonnegative
ith constraint is of equality type


PROBLEMS 

Solve LP problems 4.1 to 4.3 by the revised simplex method. 

4.1 Minimize $f = -5x_1 + 2x_2 + 5x_3 - 3x_4$

subject to

$2x_1 + x_2 - x_3 = 6$
$3x_1 + 8x_3 + x_4 = 7$
$x_i \ge 0, \quad i = 1 \text{ to } 4$
4.2 Maximize $f = 15x_1 + 6x_2 + 9x_3 + 2x_4$
subject to

$10x_1 + 5x_2 + 25x_3 + 3x_4 \le 50$
$12x_1 + 4x_2 + 12x_3 + x_4 \le 48$
$7x_1 + x_4 \le 35$
$x_i \ge 0, \quad i = 1 \text{ to } 4$

4.3 Minimize $f = 2x_1 + 3x_2 + 2x_3 - x_4 + x_5$
subject to

$3x_1 - 3x_2 + 4x_3 + 2x_4 - x_5 = 0$
$x_1 + x_2 + x_3 + 3x_4 + x_5 = 2$
$x_i \ge 0, \quad i = 1, 2, \ldots, 5$

4.4 Discuss the relationships between the regular simplex method and the revised simplex 
method. 

4.5 Solve the following LP problem graphically and by the revised simplex method: 

Maximize $f = x_2$

subject to

$-x_1 + x_2 \le 0$
$-2x_1 - 3x_2 \le 6$
$x_1, x_2$ unrestricted in sign

4.6 Consider the following LP problem: 

Minimize $f = 3x_1 + x_3 + 2x_5$

subject to

$x_1 + x_3 - x_4 + x_5 = -1$
$x_2 - 2x_3 + 3x_4 + 2x_5 = -2$
$x_i \ge 0, \quad i = 1 \text{ to } 5$

Solve this problem using the dual simplex method. 

4.7 Maximize $f = 4x_1 + 2x_2$

subject to

$x_1 - 2x_2 \ge 2$
$x_1 + 2x_2 = 8$
$x_1 - x_2 \le 11$
$x_1 \ge 0$, $x_2$ unrestricted in sign


(a) Write the dual of this problem. 

(b) Find the optimum solution of the dual. 

(c) Verify the solution obtained in part (b) by solving the primal problem graphically. 

4.8 A water resource system consisting of two reservoirs is shown in Fig. 4.4. The flows and 
storages are expressed in a consistent set of units. The following data are available: 


Quantity                                   Stream 1 (i = 1)    Stream 2 (i = 2)
Capacity of reservoir i                           9                   7
Available release from reservoir i                9                   6
Capacity of channel below reservoir i             4                   4
Actual release from reservoir i                  x1                  x2

The capacity of the main channel below the confluence of the two streams is 5 units.
If the benefit is equivalent to $\$2 \times 10^6$ and $\$3 \times 10^6$ per unit of water released from
reservoirs 1 and 2, respectively, determine the releases $x_1$ and $x_2$ from the reservoirs to
maximize the benefit. Solve this problem using duality theory.

Figure 4.4 Water resource system.

4.9 Solve the following LP problem by the dual simplex method:

Minimize $f = 2x_1 + 9x_2 + 24x_3 + 8x_4 + 5x_5$

subject to

$x_1 + x_2 + 2x_3 - x_5 - x_6 = 1$
$-2x_1 + x_3 + x_4 + x_5 - x_7 = 2$
$x_i \ge 0, \quad i = 1 \text{ to } 7$

4.10 Solve Problem 3.1 by solving its dual. 

4.11 Show that neither the primal nor the dual of the problem 

Maximize $f = -x_1 + 2x_2$

subject to

$-x_1 + x_2 \le -2$
$x_1 - x_2 \le 1$
$x_1 \ge 0, \quad x_2 \ge 0$

has a feasible solution. Verify your result graphically. 

4.12 Solve the following LP problem by decomposition principle, and verify your result by 
solving it by the revised simplex method: 

Maximize $f = 8x_1 + 3x_2 + 8x_3 + 6x_4$

subject to

$4x_1 + 3x_2 + x_3 + 3x_4 \le 16$
$4x_1 - x_2 + x_3 \le 12$
$x_1 + 2x_2 \le 8$
$3x_1 + x_2 \le 10$
$2x_3 + 3x_4 \le 9$
$4x_3 + x_4 \le 12$
$x_i \ge 0, \quad i = 1 \text{ to } 4$

4.13 Apply the decomposition principle to the dual of the following problem and solve it: 

Minimize $f = 10x_1 + 2x_2 + 4x_3 + 8x_4 + x_5$

subject to

$x_1 + 4x_2 - x_3 \ge 16$
$2x_1 + x_2 + x_3 \ge 4$
$3x_1 + x_4 + x_5 \ge 8$
$x_1 + 2x_4 - x_5 \ge 20$
$x_i \ge 0, \quad i = 1 \text{ to } 5$

4.14 Express the dual of the following LP problem:

Maximize $f = 2x_1 + x_2$

subject to

$x_1 - 2x_2 \ge 2$
$x_1 + 2x_2 = 8$
$x_1 - x_2 \le 11$
$x_1 \ge 0$, $x_2$ is unrestricted in sign

4.15 Find the effect of changing b = { l™} to {pg 0 } in Example 4.5 using sensitivity analysis. 

4.16 Find the effect of changing the cost coefficients $c_1$ and $c_4$ from −45 and −50 to −40
and −60, respectively, in Example 4.5 using sensitivity analysis.

4.17 Find the effect of changing $c_1$ from −45 to −40 and $c_2$ from −100 to −90 in Example
4.5 using sensitivity analysis.

4.18 If a new product, E, which requires 10 min of work on lathe and 10 min of work on 
milling machine per unit, with a profit of $120 per unit is available in Example 4.5, 
determine whether it is worth manufacturing E. 

4.19 A metallurgical company produces four products, A, B, C, and D, by using copper and
zinc as basic materials. The material requirements and the profit per unit of each of the
four products, and the maximum quantities of copper and zinc available are given below:

                             Product              Maximum quantity
                        A     B     C     D       available
Copper (lb)             4     9     7    10       6000
Zinc (lb)               2     1     3    20       4000
Profit per unit ($)    15    25    20    60

Find the number of units of the various products to be produced for maximizing the 
profit. 

Solve Problems 4.20-4.28 using the data of Problem 4.19. 

4.20 Find the effect of changing the profit per unit of product D to $30. 

4.21 Find the effect of changing the profit per unit of product A to $10, and of product B to
$20.

4.22 Find the effect of changing the profit per unit of product B to $30 and of product C to
$25.

4.23 Find the effect of changing the available quantities of copper and zinc to 4000 and
6000 lb, respectively.

4.24 What is the effect of introducing a new product, E, which requires 6 lb of copper and
3 lb of zinc per unit if it brings a profit of $30 per unit?

4.25 Assume that products A, B, C, and D require, in addition to the stated amounts of copper
and zinc, 4, 3, 2, and 5 lb of nickel per unit, respectively. If the total quantity of nickel
available is 2000 lb, in what way is the original optimum solution affected?

4.26 If product A requires 5 lb of copper and 3 lb of zinc (instead of 4 lb of copper and 2 lb 
of zinc) per unit, find the change in the optimum solution. 

4.27 If product C requires 5 lb of copper and 4 lb of zinc (instead of 7 lb of copper and 3 lb 
of zinc) per unit, find the change in the optimum solution. 

4.28 If the available quantities of copper and zinc are changed to 8000 lb and 5000 lb, respec- 
tively, find the change in the optimum solution. 

4.29 Solve the following LP problem: 

Minimize $f = 8x_1 - 2x_2$

subject to

$-4x_1 + 2x_2 \le 1$
$5x_1 - 4x_2 \le 3$
$x_1 \ge 0, \quad x_2 \ge 0$

Investigate the change in the optimum solution of Problem 4.29 when the following changes are 
made (a) by using sensitivity analysis and (b) by solving the new problem graphically: 


4.30 $b_1 = 2$          4.33 $c_2 = -4$
4.31 $b_2 = 4$          4.34 $a_{11} = -5$
4.32 $c_1 = 10$         4.35 $a_{22} = -2$


4.36 Perform one iteration of Karmarkar’s method for the LP problem: 

Minimize $f = 2x_1 - 2x_2 + 5x_3$

subject to

$x_1 - x_2 = 0$
$x_1 + x_2 + x_3 = 1$
$x_i \ge 0, \quad i = 1, 2, 3$

4.37 Perform one iteration of Karmarkar’s method for the following LP problem:

Minimize $f = 3x_1 + 5x_2 - 3x_3$

subject to

$x_1 - x_3 = 0$
$x_1 + x_2 + x_3 = 1$
$x_i \ge 0, \quad i = 1, 2, 3$

4.38 Transform the following LP problem into the form required by Karmarkar’s method:

Minimize $f = x_1 + x_2 + x_3$

subject to

$x_1 + x_2 - x_3 = 4$
$3x_1 - x_2 = 0$
$x_i \ge 0, \quad i = 1, 2, 3$

4.39 A contractor has three sets of heavy construction equipment available at both New York
and Los Angeles. He has construction jobs in Seattle, Houston, and Detroit that require
two, three, and one set of equipment, respectively. The shipping costs per set between
cities i and j ($c_{ij}$) are shown in Fig. 4.5. Formulate the problem of finding the shipping
pattern that minimizes the cost.

4.40 Minimize $f(\mathbf{X}) = 3x_1^2 + 2x_2^2 + 5x_3^2 - 4x_1x_2 - 2x_1x_3 - 2x_2x_3$

subject to

$3x_1 + 5x_2 + 2x_3 \ge 10$
$3x_1 + 5x_3 \le 15$
$x_i \ge 0, \quad i = 1, 2, 3$

by quadratic programming.

4.41 Find the solution of the quadratic programming problem stated in Example 1.5. 

4.42 According to elastic-plastic theory, a frame structure fails (collapses) due to the formation
of a plastic hinge mechanism. The various possible mechanisms in which a portal frame
(Fig. 4.6) can fail are shown in Fig. 4.7. The reserve strengths of the frame in various
failure mechanisms ($Z_i$) can be expressed in terms of the plastic moment capacities of the
hinges as indicated in Fig. 4.7. Assuming that the cost of the frame is proportional to 200
times each of the moment capacities $M_1$, $M_2$, $M_6$, and $M_7$, and 100 times each of the
moment capacities $M_3$, $M_4$, and $M_5$, formulate the problem of minimizing the total cost
to ensure nonzero reserve strength in each failure mechanism. Also, suggest a suitable
technique for solving the problem. Assume that the moment capacities are restricted as
$0 \le M_i \le 2 \times 10^5$ lb-in., $i = 1, 2, \ldots, 7$. Data: $x = 100$ in., $y = 150$ in., $P_1 = 1000$ lb,
and $P_2 = 500$ lb.

Figure 4.5 Shipping costs between cities.

Figure 4.7 Possible failure mechanisms of a portal frame.

4.43 Solve the LP problem stated in Problem 4.9 using MATLAB (interior method). 

4.44 Solve the LP problem stated in Problem 4.12 using MATLAB (interior method). 

4.45 Solve the LP problem stated in Problem 4.13 using MATLAB (interior method). 

4.46 Solve the LP problem stated in Problem 4.36 using MATLAB (interior method). 

4.47 Solve the LP problem stated in Problem 4.37 using MATLAB (interior method). 

4.48 Solve the following quadratic programming problem using MATLAB: 

Maximize $f = 2x_1 + x_2 - x_1^2$

subject to $2x_1 + 3x_2 \le 6$, $2x_1 + x_2 \le 4$, $x_1 \ge 0$, $x_2 \ge 0$

4.49 Solve the following quadratic programming problem using MATLAB:

Maximize $f = 4x_1 + 6x_2 - x_1^2 - x_2^2$
subject to $x_1 + x_2 \le 2$, $x_1 \ge 0$, $x_2 \ge 0$

4.50 Solve the following quadratic programming problem using MATLAB:

Minimize $f = (x_1 - 1)^2 + x_2 - 2$

subject to $-x_1 + x_2 - 1 = 0$, $x_1 + x_2 - 2 \le 0$, $x_1 \ge 0$, $x_2 \ge 0$

4.51 Solve the following quadratic programming problem using MATLAB:

Minimize $f = x_1^2 + x_2^2 - 3x_1x_2 - 6x_1 + 5x_2$
subject to $x_1 + x_2 \le 4$, $3x_1 + 6x_2 \le 20$, $x_1 \ge 0$, $x_2 \ge 0$


5 


Nonlinear Programming I: 
One-Dimensional Minimization 
Methods


5.1 INTRODUCTION 

In Chapter 2 we saw that if the expressions for the objective function and the constraints 
are fairly simple in terms of the design variables, the classical methods of optimization 
can be used to solve the problem. On the other hand, if the optimization problem 
involves the objective function and/or constraints that are not stated as explicit functions 
of the design variables or which are too complicated to manipulate, we cannot solve it 
by using the classical analytical methods. The following example is given to illustrate a 
case where the constraints cannot be stated as explicit functions of the design variables. 
Example 5.2 illustrates a case where the objective function is a complicated one for 
which the classical methods of optimization are difficult to apply. 

Example 5.1 Formulate the problem of designing the planar truss shown in Fig. 5.1
for minimum weight subject to the constraint that the displacement of any node, in
either the vertical or the horizontal direction, should not exceed a value $\delta$.

Figure 5.1 Planar truss: (a) nodal and member numbers; (b) nodal degrees of freedom.

SOLUTION Let the density $\rho$ and Young’s modulus $E$ of the material, the length
of the members $l$, and the external loads $Q$, $R$, and $S$ be known as design data. Let
the member areas $A_1, A_2, \ldots, A_{11}$ be taken as the design variables $x_1, x_2, \ldots, x_{11}$,
respectively. The equations of equilibrium can be derived in terms of the unknown
nodal displacements $u_1, u_2, \ldots, u_{10}$ as† (the displacements $u_{11}$, $u_{12}$, $u_{13}$, and $u_{14}$ are
zero, as they correspond to the fixed nodes)

$$(4x_4 + x_6 + x_7)u_1 + \sqrt{3}(x_6 - x_7)u_2 - 4x_4u_3 - x_7u_7 + \sqrt{3}x_7u_8 = 0 \tag{E1}$$

$$\sqrt{3}(x_6 - x_7)u_1 + 3(x_6 + x_7)u_2 + \sqrt{3}x_7u_7 - 3x_7u_8 = -\frac{4Rl}{E} \tag{E2}$$

$$-4x_4u_1 + (4x_4 + 4x_5 + x_8 + x_9)u_3 + \sqrt{3}(x_8 - x_9)u_4 - 4x_5u_5 - x_8u_7 - \sqrt{3}x_8u_8 - x_9u_9 + \sqrt{3}x_9u_{10} = 0 \tag{E3}$$

$$\sqrt{3}(x_8 - x_9)u_3 + 3(x_8 + x_9)u_4 - \sqrt{3}x_8u_7 - 3x_8u_8 + \sqrt{3}x_9u_9 - 3x_9u_{10} = 0 \tag{E4}$$

$$-4x_5u_3 + (4x_5 + x_{10} + x_{11})u_5 + \sqrt{3}(x_{10} - x_{11})u_6 - x_{10}u_9 - \sqrt{3}x_{10}u_{10} = \frac{4Ql}{E} \tag{E5}$$

$$\sqrt{3}(x_{10} - x_{11})u_5 + 3(x_{10} + x_{11})u_6 - \sqrt{3}x_{10}u_9 - 3x_{10}u_{10} = 0 \tag{E6}$$

$$-x_7u_1 + \sqrt{3}x_7u_2 - x_8u_3 - \sqrt{3}x_8u_4 + (4x_1 + 4x_2 + x_7 + x_8)u_7 - \sqrt{3}(x_7 - x_8)u_8 - 4x_2u_9 = 0 \tag{E7}$$

$$\sqrt{3}x_7u_1 - 3x_7u_2 - \sqrt{3}x_8u_3 - 3x_8u_4 - \sqrt{3}(x_7 - x_8)u_7 + 3(x_7 + x_8)u_8 = 0 \tag{E8}$$

$$-x_9u_3 + \sqrt{3}x_9u_4 - x_{10}u_5 - \sqrt{3}x_{10}u_6 - 4x_2u_7 + (4x_2 + 4x_3 + x_9 + x_{10})u_9 - \sqrt{3}(x_9 - x_{10})u_{10} = 0 \tag{E9}$$

$$\sqrt{3}x_9u_3 - 3x_9u_4 - \sqrt{3}x_{10}u_5 - 3x_{10}u_6 - \sqrt{3}(x_9 - x_{10})u_9 + 3(x_9 + x_{10})u_{10} = -\frac{4Sl}{E} \tag{E10}$$

† According to the matrix methods of structural analysis, the equilibrium equations for the jth member are
given by [5.1]

$$\underset{4 \times 4}{[k_j]} \; \underset{4 \times 1}{\mathbf{u}_j} = \underset{4 \times 1}{\mathbf{P}_j}$$

where the stiffness matrix can be expressed as

$$[k_j] = \frac{A_j E_j}{l_j} \begin{bmatrix} \cos^2\theta_j & \cos\theta_j \sin\theta_j & -\cos^2\theta_j & -\cos\theta_j \sin\theta_j \\ \cos\theta_j \sin\theta_j & \sin^2\theta_j & -\cos\theta_j \sin\theta_j & -\sin^2\theta_j \\ -\cos^2\theta_j & -\cos\theta_j \sin\theta_j & \cos^2\theta_j & \cos\theta_j \sin\theta_j \\ -\cos\theta_j \sin\theta_j & -\sin^2\theta_j & \cos\theta_j \sin\theta_j & \sin^2\theta_j \end{bmatrix}$$

where $\theta_j$ is the inclination of the jth member with respect to the x-axis, $A_j$ the cross-sectional area of the
jth member, $l_j$ the length of the jth member, $\mathbf{u}_j$ the vector of displacements for the jth member, and $\mathbf{P}_j$
the vector of loads for the jth member. The formulation of the equilibrium equations for the complete truss
follows fairly standard procedure [5.1].
It is important to note that an explicit closed-form solution cannot be obtained for
the displacements as the number of equations becomes large. However, given any
vector $\mathbf{X}$, the system of Eqs. (E1) to (E10) can be solved numerically to find the nodal
displacements $u_1, u_2, \ldots, u_{10}$.

The optimization problem can be stated as follows:

$$\text{Minimize } f(\mathbf{X}) = \sum_{i=1}^{11} \rho x_i l_i \tag{E11}$$

subject to the constraints

$$g_j(\mathbf{X}) = |u_j(\mathbf{X})| - \delta \le 0, \quad j = 1, 2, \ldots, 10 \tag{E12}$$

$$x_i \ge 0, \quad i = 1, 2, \ldots, 11 \tag{E13}$$

The objective function of this problem is a straightforward function of the design variables
as given in Eq. (E11). The constraints, although written by the abstract expressions
$g_j(\mathbf{X})$, cannot easily be written as explicit functions of the components of $\mathbf{X}$. However,
given any vector $\mathbf{X}$ we can calculate $g_j(\mathbf{X})$ numerically. Many engineering design
problems possess this characteristic (i.e., the objective and/or the constraints cannot be
written explicitly in terms of the design variables). In such cases we need to use the
numerical methods of optimization for solution.


Example 5.2 The shear stress induced along the z-axis when two spheres are in contact
with each other is given by

$$\frac{\tau_{zx}}{p_{\max}} = \frac{1}{2} \left[ \frac{3}{2\left\{1 + \left(\dfrac{z}{a}\right)^2\right\}} - (1 + \nu)\left\{1 - \frac{z}{a} \tan^{-1}\left(\frac{1}{z/a}\right)\right\} \right] \tag{E1}$$

where $a$ is the radius of the contact area and $p_{\max}$ is the maximum pressure developed
at the center of the contact area (Fig. 5.2):

$$a = \left\{ \frac{3F}{8} \, \frac{\dfrac{1 - \nu_1^2}{E_1} + \dfrac{1 - \nu_2^2}{E_2}}{\dfrac{1}{d_1} + \dfrac{1}{d_2}} \right\}^{1/3} \tag{E2}$$

$$p_{\max} = \frac{3F}{2\pi a^2} \tag{E3}$$

where $F$ is the contact force, $E_1$ and $E_2$ are Young’s moduli of the two spheres, $\nu_1$
and $\nu_2$ are Poisson’s ratios of the two spheres, and $d_1$ and $d_2$ the diameters of the
two spheres. In many practical applications, such as ball bearings, when the contact
load ($F$) is large, a crack originates at the point of maximum shear stress and propagates
to the surface, leading to a fatigue failure. To locate the origin of a crack, it
is necessary to find the point at which the shear stress attains its maximum value.
Formulate the problem of finding the location of maximum shear stress for $\nu = \nu_1 =
\nu_2 = 0.3$.
Figure 5.2 Contact stress between two spheres.

SOLUTION For $\nu_1 = \nu_2 = 0.3$, Eq. (E1) reduces to

$$f(\lambda) = \frac{0.75}{1 + \lambda^2} + 0.65\lambda \tan^{-1}\frac{1}{\lambda} - 0.65 \tag{E4}$$

where $f = \tau_{zx}/p_{\max}$ and $\lambda = z/a$. Since Eq. (E4) is a nonlinear function of the distance,
$\lambda$, the application of the necessary condition for the maximum of $f$, $df/d\lambda = 0$, gives
rise to a nonlinear equation from which a closed-form solution for $\lambda^*$ cannot easily be
obtained. In such cases, numerical methods of optimization can be conveniently used
to find the value of $\lambda^*$.
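As a rough numerical illustration of the last remark, Eq. (E4) can be maximized with a simple interval-reduction search. The sketch below uses a golden-section style search, which is only one of several possible choices and is an assumption made here; the search interval [0.01, 2.0] and the function names are likewise illustrative.

```python
import math


def f(lam):
    """Eq. (E4): tau_zx / p_max as a function of lam = z/a for nu = 0.3."""
    return 0.75 / (1.0 + lam**2) + 0.65 * lam * math.atan(1.0 / lam) - 0.65


def golden_max(func, a, b, tol=1e-8):
    """Locate the maximum of a unimodal function on [a, b] by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - g * (b - a)   # left interior point
        d = a + g * (b - a)   # right interior point
        if func(c) > func(d):
            b = d             # maximum cannot lie to the right of d
        else:
            a = c             # maximum cannot lie to the left of c
    return 0.5 * (a + b)


lam_star = golden_max(f, 0.01, 2.0)
print(lam_star, f(lam_star))
```

For these data the search settles near $\lambda^* \approx 0.48$ with $f(\lambda^*) \approx 0.31$, that is, the maximum shear stress occurs at a depth of roughly half the contact radius. For simplicity both interior points are re-evaluated in every pass; an efficient implementation would reuse one of them.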

The basic philosophy of most of the numerical methods of optimization is to 
produce a sequence of improved approximations to the optimum according to the 
following scheme: 

1. Start with an initial trial point $\mathbf{X}_1$.

2. Find a suitable direction $\mathbf{S}_i$ ($i = 1$ to start with) that points in the general
direction of the optimum.

3. Find an appropriate step length $\lambda_i^*$ for movement along the direction $\mathbf{S}_i$.

4. Obtain the new approximation $\mathbf{X}_{i+1}$ as

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i \tag{5.1}$$

5. Test whether $\mathbf{X}_{i+1}$ is optimum. If $\mathbf{X}_{i+1}$ is optimum, stop the procedure.
Otherwise, set a new $i = i + 1$ and repeat step (2) onward.
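The five steps above can be sketched as a short loop. The example below is a minimal illustration under stated assumptions: the test function is the objective of Example 4.14 with its constraints dropped, the direction $\mathbf{S}_i$ is taken as the negative gradient (a choice from the methods of Chapter 6, assumed here), and the step length uses the closed-form minimizer that is available only because the function is quadratic.

```python
H = [[2.0, -2.0], [-2.0, 4.0]]   # constant Hessian of the quadratic test function


def grad(x):
    """Gradient of f = -4*x1 + x1**2 - 2*x1*x2 + 2*x2**2."""
    return [-4.0 + 2.0 * x[0] - 2.0 * x[1], -2.0 * x[0] + 4.0 * x[1]]


def minimize(x, tol=1e-10, max_iter=1000):
    """Iterate X_{i+1} = X_i + lam_i* S_i until the gradient is negligible."""
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) < tol:           # step 5: optimality test
            break
        s = [-gi for gi in g]                        # step 2: direction S_i
        Hs = [sum(H[r][c] * s[c] for c in range(2)) for r in range(2)]
        # step 3: exact step length for a quadratic, lam* = (s's) / (s'Hs)
        lam = sum(si * si for si in s) / sum(si * hi for si, hi in zip(s, Hs))
        x = [xi + lam * si for xi, si in zip(x, s)]  # step 4: Eq. (5.1)
    return x


x_opt = minimize([0.0, 0.0])
print(x_opt)
```

For a general function the line `lam = ...` would be replaced by one of the one-dimensional minimization methods of this chapter, which is exactly the role those methods play in the overall scheme.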
The iterative procedure indicated by Eq. (5.1) is valid for unconstrained as well as
constrained optimization problems. The procedure is represented graphically for a
hypothetical two-variable problem in Fig. 5.3. Equation (5.1) indicates that the efficiency
of an optimization method depends on the efficiency with which the quantities $\lambda_i^*$ and
$\mathbf{S}_i$ are determined. The methods of finding the step length $\lambda_i^*$ are considered in this
chapter and the methods of finding $\mathbf{S}_i$ are considered in Chapters 6 and 7.

If $f(\mathbf{X})$ is the objective function to be minimized, the problem of determining $\lambda_i^*$
reduces to finding the value $\lambda_i = \lambda_i^*$ that minimizes $f(\mathbf{X}_{i+1}) = f(\mathbf{X}_i + \lambda_i \mathbf{S}_i) = f(\lambda_i)$
for fixed values of $\mathbf{X}_i$ and $\mathbf{S}_i$. Since $f$ becomes a function of one variable $\lambda_i$ only, the
methods of finding $\lambda_i^*$ in Eq. (5.1) are called one-dimensional minimization methods.
Several methods are available for solving a one-dimensional minimization problem.
These can be classified as shown in Table 5.1.

We saw in Chapter 2 that the differential calculus method of optimization is an 
analytical approach and is applicable to continuous, twice-differentiable functions. In 
this method, calculation of the numerical value of the objective function is virtually the 
last step of the process. The optimal value of the objective function is calculated after 
determining the optimal values of the decision variables. In the numerical methods 
of optimization, an opposite procedure is followed in that the values of the objective 
function are first found at various combinations of the decision variables and conclu- 
sions are then drawn regarding the optimal solution. The elimination methods can be 
used for the minimization of even discontinuous functions. The quadratic and cubic
interpolation methods involve polynomial approximations to the given function. The
direct root methods are root finding methods that can be considered to be equivalent
to quadratic interpolation.

Table 5.1 One-dimensional Minimization Methods

Analytical methods (differential calculus methods)

Numerical methods
    Elimination methods
        Unrestricted search
        Exhaustive search
        Dichotomous search
        Fibonacci method
        Golden section method
    Interpolation methods
        Requiring no derivatives (quadratic)
        Requiring derivatives
            Cubic
            Direct root
                Newton
                Quasi-Newton
                Secant


5.2 UNIMODAL FUNCTION 

A unimodal function is one that has only one peak (maximum) or valley (minimum) 
in a given interval. Thus a function of one variable is said to be unimodal if, given 
that two values of the variable are on the same side of the optimum, the one nearer 
the optimum gives the better functional value (i.e., the smaller value in the case of a 
minimization problem). This can be stated mathematically as follows: 

A function $f(x)$ is unimodal if (i) $x_1 < x_2 < x^*$ implies that $f(x_2) <
f(x_1)$, and (ii) $x_2 > x_1 > x^*$ implies that $f(x_1) < f(x_2)$, where $x^*$ is the
minimum point.

Some examples of unimodal functions are shown in Fig. 5.4. Thus a unimodal function
can be a nondifferentiable or even a discontinuous function. If a function is known to
be unimodal in a given range, the interval in which the minimum lies can be narrowed
down provided that the function values are known at two different points in the range.

Figure 5.5 Outcome of first two experiments: (a) $f_1 < f_2$; (b) $f_1 > f_2$; (c) $f_1 = f_2$.

For example, consider the normalized interval [0, 1] and two function evaluations
within the interval as shown in Fig. 5.5. There are three possible outcomes, namely,
$f_1 < f_2$, $f_1 > f_2$, or $f_1 = f_2$. If the outcome is that $f_1 < f_2$, the minimizing $x$ cannot
lie to the right of $x_2$. Thus that part of the interval $[x_2, 1]$ can be discarded and a new
smaller interval of uncertainty, $[0, x_2]$, results as shown in Fig. 5.5a. If $f(x_1) > f(x_2)$,
the interval $[0, x_1]$ can be discarded to obtain a new smaller interval of uncertainty,
$[x_1, 1]$ (Fig. 5.5b), while if $f(x_1) = f(x_2)$, intervals $[0, x_1]$ and $[x_2, 1]$ can both be
discarded to obtain the new interval of uncertainty as $[x_1, x_2]$ (Fig. 5.5c). Further,
if one of the original experiments† remains within the new interval, as will be the
situation in Fig. 5.5a and b, only one other experiment need be placed within the new
interval in order that the process be repeated. In situations such as Fig. 5.5c, two more
experiments are to be placed in the new interval in order to find a reduced interval of
uncertainty.
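The three outcomes can be written as a small function that returns the reduced interval of uncertainty. This is a minimal sketch; the function name `eliminate` and the default interval [0, 1] are illustrative assumptions.

```python
def eliminate(f, x1, x2, a=0.0, b=1.0):
    """Return the reduced interval of uncertainty for a unimodal f on [a, b].

    x1 < x2 are the two experiments placed inside [a, b].
    """
    assert a <= x1 < x2 <= b
    f1, f2 = f(x1), f(x2)
    if f1 < f2:
        return (a, x2)      # Fig. 5.5a: discard [x2, b]
    if f1 > f2:
        return (x1, b)      # Fig. 5.5b: discard [a, x1]
    return (x1, x2)         # Fig. 5.5c: discard both end pieces


def parabola(x):
    """Unimodal test function with its minimum at x = 0.5."""
    return (x - 0.5) ** 2


print(eliminate(parabola, 0.625, 0.75))
print(eliminate(parabola, 0.25, 0.375))
print(eliminate(parabola, 0.25, 0.75))
```

In the first two calls one of the original experiments lies inside the returned interval, so a single new evaluation suffices for the next reduction; in the third call both experiments sit on the boundary and two new evaluations are needed.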

The assumption of unimodality is made in all the elimination techniques. If a 
function is known to be multimodal (i.e., having several valleys or peaks), the range of 
the function can be subdivided into several parts and the function treated as a unimodal 
function in each part. 


Elimination Methods 


5.3 UNRESTRICTED SEARCH 

In most practical problems, the optimum solution is known to lie within restricted 
ranges of the design variables. In some cases this range is not known, and hence the 
search has to be made with no restrictions on the values of the variables. 

5.3.1 Search with Fixed Step Size 

The most elementary approach for such a problem is to use a fixed step size and move 
from an initial guess point in a favorable direction (positive or negative). The step size 


+ Each function evaluation is termed as an experiment or a trial in the elimination methods. 
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used must be small in relation to the final accuracy desired. Although this method is 
very simple to implement, it is not efficient in many cases. This method is described 
in the following steps: 

1. Start with an initial guess point, say, $x_1$.

2. Find $f_1 = f(x_1)$.

3. Assuming a step size $s$, find $x_2 = x_1 + s$.

4. Find $f_2 = f(x_2)$.

5. If $f_2 < f_1$, and if the problem is one of minimization, the assumption of unimodality
indicates that the desired minimum cannot lie at $x < x_1$. Hence the
search can be continued further along points $x_3, x_4, \ldots$ using the unimodality
assumption while testing each pair of experiments. This procedure is continued
until a point, $x_i = x_1 + (i - 1)s$, shows an increase in the function
value.

6. The search is terminated at $x_i$, and either $x_{i-1}$ or $x_i$ can be taken as the optimum
point.

7. Originally, if $f_2 > f_1$, the search should be carried in the reverse direction at
points $x_{-2}, x_{-3}, \ldots$, where $x_{-j} = x_1 - (j - 1)s$.

8. If $f_2 = f_1$, the desired minimum lies in between $x_1$ and $x_2$, and the minimum
point can be taken as either $x_1$ or $x_2$.

9. If it happens that both $f_2$ and $f_{-2}$ are greater than $f_1$, it implies that the desired
minimum will lie in the double interval $x_{-2} < x < x_2$.
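The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `fixed_step_search`, the `max_steps` safeguard, and the choice to return a bracketing interval (rather than a single point) are assumptions made here, not part of the method as stated.

```python
def fixed_step_search(f, x1, s, max_steps=1_000_000):
    """Bracket the minimum of a unimodal f by marching from x1 with a fixed step s.

    Returns (lo, hi), an interval of width 2*s that contains the minimum.
    """
    f1 = f(x1)
    f_plus = f(x1 + s)
    if f_plus < f1:
        direction, f_prev = 1.0, f_plus       # step 5: keep moving to the right
    else:
        f_minus = f(x1 - s)
        if f_minus < f1:
            direction, f_prev = -1.0, f_minus  # step 7: reverse direction
        else:
            return x1 - s, x1 + s              # steps 8 and 9: minimum is next to x1

    # One step has already been taken in the chosen direction; continue from i = 2.
    for i in range(2, max_steps + 1):
        x_i = x1 + direction * i * s
        f_i = f(x_i)
        if f_i > f_prev:                       # first increase: stop (step 6)
            lo, hi = sorted((x1 + direction * (i - 2) * s, x_i))
            return lo, hi
        f_prev = f_i
    raise RuntimeError("no increase found within max_steps; f may be unbounded")


if __name__ == "__main__":
    print(fixed_step_search(lambda x: x * (x - 1.5), 0.0, 0.05))
```

The returned interval spans the two points on either side of the best point found, so either the best point or the midpoint of the interval can be reported as the approximate optimum.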


5.3.2 Search with Accelerated Step Size 

Although the search with a fixed step size appears to be very simple, its major limitation
comes because of the unrestricted nature of the region in which the minimum can lie.
For example, if the minimum point for a particular function happens to be
$x_{\text{opt}} = 50{,}000$ and, in the absence of knowledge about the location of the minimum, if $x_1$ and
$s$ are chosen as 0.0 and 0.1, respectively, we have to evaluate the function about 500,000
times (50,000/0.1 steps) to find the minimum point. This involves a large amount of computational work.
An obvious improvement can be achieved by increasing the step size gradually until
the minimum point is bracketed. A simple method consists of doubling the step size
as long as the move results in an improvement of the objective function. Several other
improvements of this method can be developed. One possibility is to reduce the step
length after bracketing the optimum in $(x_{i-1}, x_i)$. By starting either from $x_{i-1}$ or $x_i$,
the basic procedure can be applied with a reduced step size. This procedure can be
repeated until the bracketed interval becomes sufficiently small. The following example
illustrates the search method with accelerated step size.

Example 5.3 Find the minimum of $f = x(x - 1.5)$ by starting from 0.0 with an initial
step size of 0.05.

SOLUTION The function value at $x_1$ is $f_1 = 0.0$. If we try to start moving in the
negative $x$ direction, we find that $x_{-2} = -0.05$ and $f_{-2} = 0.0775$. Since $f_{-2} > f_1$, the
assumption of unimodality indicates that the minimum cannot lie toward the left of
$x_{-2}$. Thus we start moving in the positive $x$ direction and obtain the following results:
i    Value of s    $x_i = x_1 + s$    $f_i = f(x_i)$    Is $f_i > f_{i-1}$?

1    -             0.0                 0.0              -
2    0.05          0.05               -0.0725           No
3    0.10          0.10               -0.140            No
4    0.20          0.20               -0.260            No
5    0.40          0.40               -0.440            No
6    0.80          0.80               -0.560            No
7    1.60          1.60               +0.160            Yes

From these results, the optimum point can be seen to be $x_{\text{opt}} \approx x_6 = 0.8$. In this case,
the points $x_6$ and $x_7$ do not really bracket the minimum point but provide information
about it. If a better approximation to the minimum is desired, the procedure can be
restarted from $x_5$ with a smaller step size.
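The table of Example 5.3 can be reproduced with a short sketch of the step-doubling idea. The function name `accelerated_step_search` and the `max_doublings` safeguard are assumptions of this sketch, and it presumes the favorable direction (here, positive $x$) has already been identified as in the solution above.

```python
def accelerated_step_search(f, x1, s0, max_doublings=60):
    """Evaluate f at x1 + s for s = s0, 2*s0, 4*s0, ... until f increases.

    Returns the list of (x_i, f_i) pairs, starting with (x1, f(x1)).
    """
    history = [(x1, f(x1))]
    s = s0
    for _ in range(max_doublings):
        x = x1 + s
        fx = f(x)
        history.append((x, fx))
        if fx > history[-2][1]:   # f_i > f_{i-1}: stop, the previous point is best
            break
        s *= 2.0                  # otherwise double the step size
    return history


if __name__ == "__main__":
    for i, (x, fx) in enumerate(accelerated_step_search(lambda x: x * (x - 1.5), 0.0, 0.05), 1):
        print(i, round(x, 4), round(fx, 4))
```

To refine the result, the same routine could be called again from an earlier point of the history with a smaller initial step, as suggested at the end of the example.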


5.4 EXHAUSTIVE SEARCH 

The exhaustive search method can be used to solve problems where the interval in
which the optimum is known to lie is finite. Let $x_s$ and $x_f$ denote, respectively, the
starting and final points of the interval of uncertainty.† The exhaustive search method
consists of evaluating the objective function at a predetermined number of equally
spaced points in the interval $(x_s, x_f)$, and reducing the interval of uncertainty using the
assumption of unimodality. Suppose that a function is defined on the interval $(x_s, x_f)$
and let it be evaluated at eight equally spaced interior points $x_1$ to $x_8$. Assuming that
the function values appear as shown in Fig. 5.6, the minimum point must lie, according
to the assumption of unimodality, between points $x_5$ and $x_7$. Thus the interval $(x_5, x_7)$
can be considered as the final interval of uncertainty.

In general, if the function is evaluated at $n$ equally spaced points in the original
interval of uncertainty of length $L_0 = x_f - x_s$, and if the optimum value of the function
(among the $n$ function values) turns out to be at point $x_j$, the final interval of uncertainty
is given by

$$L_n = x_{j+1} - x_{j-1} = \frac{2}{n+1} L_0 \tag{5.2}$$

† Since the interval $(x_s, x_f)$, but not the exact location of the optimum in this interval, is known to us, the
interval $(x_s, x_f)$ is called the interval of uncertainty.

The final interval of uncertainty obtainable for different numbers of trials in the exhaustive
search method is given below:

Number of trials    2      3      4      5      6      ...    n
$L_n/L_0$           2/3    2/4    2/5    2/6    2/7    ...    2/(n + 1)

Since the function is evaluated at all n points simultaneously, this method can be called 
a simultaneous search method. This method is relatively inefficient compared to the 
sequential search methods discussed next, where the information gained from the initial 
trials is used in placing the subsequent experiments. 

Example 5.4 Find the minimum of $f = x(x - 1.5)$ in the interval (0.0, 1.00) to within
10% of the exact value.

SOLUTION If the middle point of the final interval of uncertainty is taken as the
approximate optimum point, the maximum deviation could be $1/(n + 1)$ times the
initial interval of uncertainty. Thus to find the optimum within 10% of the exact value,
we should have

$$\frac{1}{n+1} \le \frac{1}{10} \quad \text{or} \quad n \ge 9$$

By taking $n = 9$, the following function values can be calculated:

$i$                1       2       3       4       5       6       7       8       9
$x_i$              0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
$f_i = f(x_i)$     -0.14   -0.26   -0.36   -0.44   -0.50   -0.54   -0.56   -0.56   -0.54

Since $f_7 = f_8$, the assumption of unimodality gives the final interval of uncertainty as
$L_9 = (0.7, 0.8)$. By taking the middle point of $L_9$ (i.e., 0.75) as an approximation to
the optimum point, we find that it is, in fact, the true optimum point.
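A minimal sketch of the exhaustive search, under the assumption that the final interval is always taken as $(x_{j-1}, x_{j+1})$ in accordance with Eq. (5.2), is given below. The function name `exhaustive_search` is hypothetical, and the sketch does not exploit ties such as $f_7 = f_8$ in Example 5.4, which allow the interval to be narrowed further by inspection.

```python
def exhaustive_search(f, xs, xf, n):
    """Evaluate f at n equally spaced interior points of (xs, xf).

    Returns the interval (x_{j-1}, x_{j+1}) around the best point x_j,
    whose length is 2*(xf - xs)/(n + 1).
    """
    h = (xf - xs) / (n + 1)                       # spacing between experiments
    points = [xs + i * h for i in range(1, n + 1)]
    values = [f(x) for x in points]
    j = min(range(n), key=values.__getitem__)     # index of the smallest value
    return points[j] - h, points[j] + h


if __name__ == "__main__":
    print(exhaustive_search(lambda x: x * (x - 1.5), 0.0, 1.0, 9))
```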


5.5 DICHOTOMOUS SEARCH 

The exhaustive search method is a simultaneous search method in which all the exper- 
iments are conducted before any judgment is made regarding the location of the 
optimum point. The dichotomous search method, as well as the Fibonacci and the 
golden section methods discussed in subsequent sections, are sequential search meth- 
ods in which the result of any experiment influences the location of the subsequent 
experiment. 

Figure 5.7 Dichotomous search.

In the dichotomous search, two experiments are placed as close as possible at
the center of the interval of uncertainty. Based on the relative values of the objective
function at the two points, almost half of the interval of uncertainty is eliminated. Let
the positions of the two experiments be given by (Fig. 5.7)

$$x_1 = \frac{L_0}{2} - \frac{\delta}{2}, \qquad x_2 = \frac{L_0}{2} + \frac{\delta}{2}$$

where $\delta$ is a small positive number chosen so that the two experiments give significantly
different results. Then the new interval of uncertainty is given by $(L_0/2 + \delta/2)$. The
building block of dichotomous search consists of conducting a pair of experiments
at the center of the current interval of uncertainty. The next pair of experiments is,
therefore, conducted at the center of the remaining interval of uncertainty. This results
in the reduction of the interval of uncertainty by nearly a factor of 2. The intervals
of uncertainty at the end of different pairs of experiments are given in the following
table:

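Number of experiments            2                              4                                                                       6

Final interval of uncertainty    $\frac{1}{2}(L_0 + \delta)$    $\frac{1}{2}\left(\frac{L_0 + \delta}{2}\right) + \frac{\delta}{2}$    $\frac{1}{2}\left(\frac{L_0 + \delta}{4} + \frac{\delta}{2}\right) + \frac{\delta}{2}$

In general, the final interval of uncertainty after conducting $n$ experiments ($n$ even) is
given by

$$L_n = \frac{L_0}{2^{n/2}} + \delta\left(1 - \frac{1}{2^{n/2}}\right) \tag{5.3}$$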

The following example is given to illustrate the method of search. 

Example 5.5 Find the minimum of $f = x(x - 1.5)$ in the interval (0.0, 1.00) to within
10% of the exact value.

SOLUTION The ratio of final to initial intervals of uncertainty is given by [from
Eq. (5.3)]

$$\frac{L_n}{L_0} = \frac{1}{2^{n/2}} + \frac{\delta}{L_0}\left(1 - \frac{1}{2^{n/2}}\right)$$

where $\delta$ is a small quantity, say 0.001, and $n$ is the number of experiments. If the
middle point of the final interval is taken as the optimum point, the requirement can
be stated as

$$\frac{1}{2}\frac{L_n}{L_0} \le \frac{1}{10}$$

i.e.,

$$\frac{1}{2^{n/2}} + \frac{\delta}{L_0}\left(1 - \frac{1}{2^{n/2}}\right) \le \frac{1}{5}$$

Since $\delta = 0.001$ and $L_0 = 1.0$, we have

$$\frac{1}{2^{n/2}} + \frac{1}{1000}\left(1 - \frac{1}{2^{n/2}}\right) \le \frac{1}{5}$$

i.e.,

$$\frac{999}{1000}\,\frac{1}{2^{n/2}} \le \frac{995}{5000} \quad \text{or} \quad 2^{n/2} \ge \frac{999}{199} \simeq 5.0$$

Since $n$ has to be even, this inequality gives the minimum admissible value of $n$ as 6.
The search is made as follows. The first two experiments are made at

$$x_1 = \frac{L_0}{2} - \frac{\delta}{2} = 0.5 - 0.0005 = 0.4995$$

$$x_2 = \frac{L_0}{2} + \frac{\delta}{2} = 0.5 + 0.0005 = 0.5005$$

with the function values given by

$$f_1 = f(x_1) = 0.4995(-1.0005) \simeq -0.49975$$

$$f_2 = f(x_2) = 0.5005(-0.9995) \simeq -0.50025$$

Since $f_2 < f_1$, the new interval of uncertainty will be (0.4995, 1.0). The second pair
of experiments is conducted at

$$x_3 = \left(0.4995 + \frac{1.0 - 0.4995}{2}\right) - 0.0005 = 0.74925$$

$$x_4 = \left(0.4995 + \frac{1.0 - 0.4995}{2}\right) + 0.0005 = 0.75025$$

which give the function values as

$$f_3 = f(x_3) = 0.74925(-0.75075) = -0.5624994375$$

$$f_4 = f(x_4) = 0.75025(-0.74975) = -0.5624999375$$

Since $f_3 > f_4$, we delete $(0.4995, x_3)$ and obtain the new interval of uncertainty as

$$(x_3, 1.0) = (0.74925, 1.0)$$

The final set of experiments will be conducted at

$$x_5 = \left(0.74925 + \frac{1.0 - 0.74925}{2}\right) - 0.0005 = 0.874125$$

$$x_6 = \left(0.74925 + \frac{1.0 - 0.74925}{2}\right) + 0.0005 = 0.875125$$

The corresponding function values are

$$f_5 = f(x_5) = 0.874125(-0.625875) = -0.5470929844$$

$$f_6 = f(x_6) = 0.875125(-0.624875) = -0.5468437342$$

Since $f_5 < f_6$, the new interval of uncertainty is given by $(x_3, x_6) = (0.74925, 0.875125)$.
The middle point of this interval can be taken as optimum, and hence

$$x_{\text{opt}} \simeq 0.8121875 \quad \text{and} \quad f_{\text{opt}} \simeq -0.5586327148$$
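The three stages of Example 5.5 can be checked with the following sketch of the dichotomous search. The function name `dichotomous_search` and its argument order are assumptions of this sketch; the logic follows the pairwise placement described in this section.

```python
def dichotomous_search(f, a, b, delta, n):
    """Reduce the interval (a, b) using n experiments (n even), placed in pairs
    a distance delta/2 on either side of the current midpoint."""
    if n % 2 != 0:
        raise ValueError("n must be even: experiments are conducted in pairs")
    for _ in range(n // 2):
        mid = 0.5 * (a + b)
        x1, x2 = mid - 0.5 * delta, mid + 0.5 * delta
        if f(x1) < f(x2):
            b = x2        # minimum cannot lie to the right of x2
        else:
            a = x1        # minimum cannot lie to the left of x1
    return a, b


if __name__ == "__main__":
    a, b = dichotomous_search(lambda x: x * (x - 1.5), 0.0, 1.0, 0.001, 6)
    print(a, b, 0.5 * (a + b))
```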


5.6 INTERVAL HALVING METHOD 

In the interval halving method, exactly one-half of the current interval of uncertainty
is deleted in every stage. It requires three experiments in the first stage and two
experiments in each subsequent stage. The procedure can be described by the following
steps:

1. Divide the initial interval of uncertainty $L_0 = [a, b]$ into four equal parts and
label the middle point $x_0$ and the quarter-interval points $x_1$ and $x_2$.

2. Evaluate the function $f(x)$ at the three interior points to obtain $f_1 = f(x_1)$,
$f_0 = f(x_0)$, and $f_2 = f(x_2)$.

3. (a) If $f_2 > f_0 > f_1$ as shown in Fig. 5.8a, delete the interval $(x_0, b)$, label $x_1$
and $x_0$ as the new $x_0$ and $b$, respectively, and go to step 4.

(b) If $f_2 < f_0 < f_1$ as shown in Fig. 5.8b, delete the interval $(a, x_0)$, label $x_2$
and $x_0$ as the new $x_0$ and $a$, respectively, and go to step 4.

(c) If $f_1 > f_0$ and $f_2 > f_0$ as shown in Fig. 5.8c, delete both the intervals
$(a, x_1)$ and $(x_2, b)$, label $x_1$ and $x_2$ as the new $a$ and $b$, respectively, and
go to step 4.

4. Test whether the new interval of uncertainty, $L = b - a$, satisfies the convergence
criterion $L \le \varepsilon$, where $\varepsilon$ is a small quantity. If the convergence criterion
is satisfied, stop the procedure. Otherwise, set the new $L_0 = L$ and go to step 1.

Remarks:

1. In this method, the function value at the middle point of the interval of uncertainty,
$f_0$, will be available in all the stages except the first stage.

2. The interval of uncertainty remaining at the end of $n$ experiments ($n \ge 3$ and
odd) is given by

$$L_n = \left(\frac{1}{2}\right)^{(n-1)/2} L_0 \tag{5.4}$$

Figure 5.8 Possibilities in the interval halving method: (a) $f_2 > f_0 > f_1$; (b) $f_1 > f_0 > f_2$;
(c) $f_1 > f_0$ and $f_2 > f_0$.

Example 5.6 Find the minimum of $f = x(x - 1.5)$ in the interval (0.0, 1.0) to within
10% of the exact value.


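SOLUTION If the middle point of the final interval of uncertainty is taken as the
optimum point, the specified accuracy can be achieved if

$$\frac{1}{2} L_n \le \frac{L_0}{10} \quad \text{or} \quad \left(\frac{1}{2}\right)^{(n-1)/2} L_0 \le \frac{L_0}{5} \tag{E1}$$

Since $L_0 = 1$, Eq. (E1) gives

$$\frac{1}{2^{(n-1)/2}} \le \frac{1}{5} \quad \text{or} \quad 2^{(n-1)/2} \ge 5 \tag{E2}$$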

Since $n$ has to be odd, inequality (E2) gives the minimum permissible value of $n$ as 7.
With this value of $n = 7$, the search is conducted as follows. The first three experiments
are placed at one-fourth points of the interval $L_0 = [a = 0, b = 1]$ as

$$x_1 = 0.25, \quad f_1 = 0.25(-1.25) = -0.3125$$

$$x_0 = 0.50, \quad f_0 = 0.50(-1.00) = -0.5000$$

$$x_2 = 0.75, \quad f_2 = 0.75(-0.75) = -0.5625$$

Since $f_1 > f_0 > f_2$, we delete the interval $(a, x_0) = (0.0, 0.5)$, label $x_2$ and $x_0$ as the
new $x_0$ and $a$ so that $a = 0.5$, $x_0 = 0.75$, and $b = 1.0$. By dividing the new interval of
uncertainty, $L_3 = (0.5, 1.0)$, into four equal parts, we obtain

$$x_1 = 0.625, \quad f_1 = 0.625(-0.875) = -0.546875$$

$$x_0 = 0.750, \quad f_0 = 0.750(-0.750) = -0.562500$$

$$x_2 = 0.875, \quad f_2 = 0.875(-0.625) = -0.546875$$

Since $f_1 > f_0$ and $f_2 > f_0$, we delete both the intervals $(a, x_1)$ and $(x_2, b)$, and label
$x_1$, $x_0$, and $x_2$ as the new $a$, $x_0$, and $b$, respectively. Thus the new interval of
uncertainty will be $L_5 = (0.625, 0.875)$. Next, this interval is divided into four equal parts
to obtain

$$x_1 = 0.6875, \quad f_1 = 0.6875(-0.8125) = -0.558594$$

$$x_0 = 0.75, \quad f_0 = 0.75(-0.75) = -0.5625$$

$$x_2 = 0.8125, \quad f_2 = 0.8125(-0.6875) = -0.558594$$

Again we note that $f_1 > f_0$ and $f_2 > f_0$ and hence we delete both the intervals $(a, x_1)$
and $(x_2, b)$ to obtain the new interval of uncertainty as $L_7 = (0.6875, 0.8125)$. By
taking the middle point of this interval ($L_7$) as optimum, we obtain

$$x_{\text{opt}} \approx 0.75 \quad \text{and} \quad f_{\text{opt}} \approx -0.5625$$

(This solution happens to be the exact solution in this case.)
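The stages of Example 5.6 can be traced with the sketch below. The function name `interval_halving` and the returned evaluation count are assumptions of this sketch; the three branches correspond to cases (a), (b), and (c) of step 3, with ties treated as case (c).

```python
def interval_halving(f, a, b, eps):
    """Halve the interval of uncertainty until its length is at most eps.

    Returns (a, b, n_evals), where n_evals counts function evaluations.
    """
    x0 = 0.5 * (a + b)
    f0 = f(x0)
    n_evals = 1
    while (b - a) > eps:
        quarter = 0.25 * (b - a)
        x1, x2 = a + quarter, b - quarter
        f1, f2 = f(x1), f(x2)
        n_evals += 2
        if f1 < f0:                 # case (a): delete (x0, b)
            b, x0, f0 = x0, x1, f1
        elif f2 < f0:               # case (b): delete (a, x0)
            a, x0, f0 = x0, x2, f2
        else:                       # case (c): delete (a, x1) and (x2, b)
            a, b = x1, x2
    return a, b, n_evals


if __name__ == "__main__":
    print(interval_halving(lambda x: x * (x - 1.5), 0.0, 1.0, 0.2))
```

Note that the midpoint value $f_0$ is carried over between stages, which is why only two new evaluations are needed after the first stage.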
5.7 FIBONACCI METHOD

As stated earlier, the Fibonacci method can be used to find the minimum of a function 
of one variable even if the function is not continuous. This method, like many other 
elimination methods, has the following limitations: 

1. The initial interval of uncertainty, in which the optimum lies, has to be known. 

2. The function being optimized has to be unimodal in the initial interval of uncer- 
tainty. 

3. The exact optimum cannot be located in this method. Only an interval known as 
the final interval of uncertainty will be known. The final interval of uncertainty 
can be made as small as desired by using more computations. 

4. The number of function evaluations to be used in the search or the resolution 
required has to be specified beforehand. 

This method makes use of the sequence of Fibonacci numbers, $\{F_n\}$, for placing the
experiments. These numbers are defined as

$$F_0 = F_1 = 1$$

$$F_n = F_{n-1} + F_{n-2}, \quad n = 2, 3, 4, \ldots$$

which yield the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...


Procedure. Let $L_0$ be the initial interval of uncertainty defined by $a \le x \le b$ and $n$
be the total number of experiments to be conducted. Define

$$L_2^* = \frac{F_{n-2}}{F_n} L_0 \tag{5.5}$$

and place the first two experiments at points $x_1$ and $x_2$, which are located at a distance
of $L_2^*$ from each end of $L_0$.† This gives‡

$$x_1 = a + L_2^* = a + \frac{F_{n-2}}{F_n} L_0$$

$$x_2 = b - L_2^* = b - \frac{F_{n-2}}{F_n} L_0 = a + \frac{F_{n-1}}{F_n} L_0 \tag{5.6}$$

Discard part of the interval by using the unimodality assumption. Then there remains
a smaller interval of uncertainty $L_2$ given by§

$$L_2 = L_0 - L_2^* = L_0\left(1 - \frac{F_{n-2}}{F_n}\right) = \frac{F_{n-1}}{F_n} L_0 \tag{5.7}$$

and with one experiment left in it. This experiment will be at a distance of

$$L_2^* = \frac{F_{n-2}}{F_n} L_0 = \frac{F_{n-2}}{F_{n-1}} L_2 \tag{5.8}$$

from one end and

$$L_2 - L_2^* = \frac{F_{n-3}}{F_n} L_0 = \frac{F_{n-3}}{F_{n-1}} L_2 \tag{5.9}$$

from the other end.

† If an experiment is located at a distance of $(F_{n-2}/F_n)L_0$ from one end, it will be at a distance of
$(F_{n-1}/F_n)L_0$ from the other end. Thus $L_2^* = (F_{n-1}/F_n)L_0$ will yield the same result as with
$L_2^* = (F_{n-2}/F_n)L_0$.

‡ It can be seen that $L_2^* = (F_{n-2}/F_n) L_0 \le \frac{1}{2} L_0$ for $n \ge 2$.

§ The symbol $L_j$ is used to denote the interval of uncertainty remaining after conducting $j$ experiments,
while the symbol $L_j^*$ is used to define the position of the $j$th experiment.

Now place the third experiment in the interval $L_2$ so that the current
two experiments are located at a distance of

$$L_3^* = \frac{F_{n-3}}{F_n} L_0 = \frac{F_{n-3}}{F_{n-1}} L_2 \tag{5.10}$$

from each end of the interval $L_2$. Again the unimodality property will allow us to
reduce the interval of uncertainty to $L_3$ given by

$$L_3 = L_2 - L_3^* = L_2 - \frac{F_{n-3}}{F_{n-1}} L_2 = \frac{F_{n-2}}{F_{n-1}} L_2 = \frac{F_{n-2}}{F_n} L_0 \tag{5.11}$$

This process of discarding a certain interval and placing a new experiment in the
remaining interval can be continued, so that the location of the $j$th experiment and the
interval of uncertainty at the end of $j$ experiments are, respectively, given by

$$L_j^* = \frac{F_{n-j}}{F_{n-(j-2)}} L_{j-1} \tag{5.12}$$

$$L_j = \frac{F_{n-(j-1)}}{F_n} L_0 \tag{5.13}$$

The ratio of the interval of uncertainty remaining after conducting $j$ of the $n$ predetermined
experiments to the initial interval of uncertainty becomes

$$\frac{L_j}{L_0} = \frac{F_{n-(j-1)}}{F_n} \tag{5.14}$$

and for $j = n$, we obtain

$$\frac{L_n}{L_0} = \frac{F_1}{F_n} = \frac{1}{F_n} \tag{5.15}$$

The ratio $L_n/L_0$ will permit us to determine $n$, the required number of experiments,
to achieve any desired accuracy in locating the optimum point. Table 5.2 gives
the reduction ratio in the interval of uncertainty obtainable for different numbers of
experiments.
Table 5.2 Reduction Ratios

Value of n    Fibonacci number, $F_n$    Reduction ratio, $L_n/L_0$

 0                 1                     1.0
 1                 1                     1.0
 2                 2                     0.5
 3                 3                     0.3333
 4                 5                     0.2
 5                 8                     0.1250
 6                13                     0.07692
 7                21                     0.04762
 8                34                     0.02941
 9                55                     0.01818
10                89                     0.01124
11               144                     0.006944
12               233                     0.004292
13               377                     0.002653
14               610                     0.001639
15               987                     0.001013
16             1,597                     0.0006262
17             2,584                     0.0003870
18             4,181                     0.0002392
19             6,765                     0.0001478
20            10,946                     0.00009136
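The entries of Table 5.2 follow directly from Eq. (5.15) and can be regenerated with a few lines of code. The helper names `fibonacci_numbers` and `experiments_needed` below are hypothetical; the second one simply searches for the smallest $n$ with $1/F_n$ not exceeding a requested reduction ratio.

```python
def fibonacci_numbers(n):
    """Return [F_0, F_1, ..., F_n] using the convention F_0 = F_1 = 1."""
    F = [1, 1]
    for _ in range(2, n + 1):
        F.append(F[-1] + F[-2])
    return F[: n + 1]


def experiments_needed(ratio):
    """Smallest n for which L_n/L_0 = 1/F_n is at most the given ratio."""
    F = [1, 1]
    n = 0
    while 1.0 / F[n] > ratio:
        n += 1
        if n >= len(F):
            F.append(F[-1] + F[-2])
    return n


if __name__ == "__main__":
    for n, Fn in enumerate(fibonacci_numbers(20)):
        print(n, Fn, 1.0 / Fn)
    print(experiments_needed(0.01))
```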


Position of the Final Experiment. In this method the last experiment has to be
placed with some care. Equation (5.12) gives

$$\frac{L_n^*}{L_{n-1}} = \frac{F_0}{F_2} = \frac{1}{2} \quad \text{for all } n \tag{5.16}$$

Thus after conducting $n - 1$ experiments and discarding the appropriate interval in each
step, the remaining interval will contain one experiment precisely at its middle point.
However, the final experiment, namely, the $n$th experiment, is also to be placed at the
center of the present interval of uncertainty. That is, the position of the $n$th experiment
will be same as that of the $(n - 1)$th one, and this is true for whatever value we choose
for $n$. Since no new information can be gained by placing the $n$th experiment exactly
at the same location as that of the $(n - 1)$th experiment, we place the $n$th experiment
very close to the remaining valid experiment, as in the case of the dichotomous
search method. This enables us to obtain the final interval of uncertainty to within
$\frac{1}{2} L_{n-1}$. A flowchart for implementing the Fibonacci method of minimization is given
in Fig. 5.9.
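The procedure, including the special placement of the final experiment, can be sketched as follows. This is not a transcription of the flowchart in Fig. 5.9; the function name `fibonacci_search`, the offset parameter `eps` used to separate the last two experiments, and the requirement $n \ge 3$ are assumptions of this sketch.

```python
def fibonacci_search(f, a, b, n, eps=1e-6):
    """Fibonacci search with n >= 3 experiments on the interval [a, b].

    The last experiment is placed a distance eps from the remaining one,
    since the symmetric position would coincide with it.
    Returns the final interval of uncertainty (a, b).
    """
    if n < 3:
        raise ValueError("this sketch assumes n >= 3")
    F = [1, 1]
    for _ in range(2, n + 1):
        F.append(F[-1] + F[-2])          # F_0 = F_1 = 1

    d = F[n - 2] / F[n] * (b - a)        # L_2^* of Eq. (5.5)
    x1, x2 = a + d, b - d
    f1, f2 = f(x1), f(x2)                # experiments 1 and 2

    for j in range(3, n + 1):            # experiments 3, ..., n
        if f1 < f2:                      # discard (x2, b]
            b, x2, f2 = x2, x1, f1
            x1 = a + (b - x2)            # symmetric to the retained experiment
            if j == n:
                x1 = x2 - eps            # final experiment: offset slightly
            f1 = f(x1)
        else:                            # discard [a, x1)
            a, x1, f1 = x1, x2, f2
            x2 = b - (x1 - a)
            if j == n:
                x2 = x1 + eps
            f2 = f(x2)

    if f1 < f2:                          # final discard
        b = x2
    else:
        a = x1
    return a, b


if __name__ == "__main__":
    print(fibonacci_search(lambda x: x * (x - 1.5), 0.0, 1.0, 6))
```

With $n = 6$ on a unit interval, the final interval has length close to $1/F_6 = 1/13$, in line with Eq. (5.15); the small excess is the offset `eps`.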

Example 5.7 Minimize $f(x) = 0.65 - [0.75/(1 + x^2)] - 0.65x \tan^{-1}(1/x)$ in the
interval [0, 3] by the Fibonacci method using $n = 6$. (Note that this objective is
equivalent to the one stated in Example 5.2.)

Figure 5.9 Flowchart for implementing Fibonacci search method.

SOLUTION Here $n = 6$ and $L_0 = 3.0$, which yield

$$L_2^* = \frac{F_{n-2}}{F_n} L_0 = \frac{5}{13}(3.0) = 1.153846$$

Thus the positions of the first two experiments are given by $x_1 = 1.153846$ and
$x_2 = 3.0 - 1.153846 = 1.846154$ with $f_1 = f(x_1) = -0.207270$ and $f_2 = f(x_2) =
-0.115843$. Since $f_1$ is less than $f_2$, we can delete the interval $[x_2, 3.0]$ by using
the unimodality assumption (Fig. 5.10a). The third experiment is placed at $x_3 = 0 +
(x_2 - x_1) = 1.846154 - 1.153846 = 0.692308$, with the corresponding function value
of $f_3 = -0.291364$.

Since $f_1 > f_3$, we delete the interval $[x_1, x_2]$ (Fig. 5.10b). The next experiment
is located at $x_4 = 0 + (x_1 - x_3) = 1.153846 - 0.692308 = 0.461538$ with $f_4 =
-0.309811$. Noting that $f_4 < f_3$, we delete the interval $[x_3, x_1]$ (Fig. 5.10c). The
location of the next experiment can be obtained as $x_5 = 0 + (x_3 - x_4) = 0.692308 -
0.461538 = 0.230770$ with the corresponding objective function value of $f_5 =
-0.263678$. Since $f_5 > f_4$, we delete the interval $[0, x_5]$ (Fig. 5.10d). The final
experiment is positioned at $x_6 = x_5 + (x_3 - x_4) = 0.230770 + (0.692308 - 0.461538) =
0.461540$ with $f_6 = -0.309810$. (Note that, theoretically, the value of $x_6$ should be
same as that of $x_4$; however, it is slightly different from $x_4$, due to round-off error.)

Since $f_6 > f_4$, we delete the interval $[x_6, x_3]$ and obtain the final interval of
uncertainty as $L_6 = [x_5, x_6] = [0.230770, 0.461540]$ (Fig. 5.10e). The ratio of the final to
the initial interval of uncertainty is

$$\frac{L_6}{L_0} = \frac{0.461540 - 0.230770}{3.0} = 0.076923$$

This value can be compared with Eq. (5.15), which states that if $n$ experiments ($n = 6$)
are planned, a resolution no finer than $1/F_n = 1/F_6 = \frac{1}{13} = 0.076923$ can be expected
from the method.

Figure 5.10 Graphical representation of the solution of Example 5.7: panels (a) to (e) show the
intervals $L_2$ to $L_6$ together with the experiment locations and function values quoted above.

5.8 GOLDEN SECTION METHOD

The golden section method is same as the Fibonacci method except that in the Fibonacci
method the total number of experiments to be conducted has to be specified before
beginning the calculation, whereas this is not required in the golden section method.
In the Fibonacci method, the location of the first two experiments is determined by
the total number of experiments, $N$. In the golden section method we start with the
assumption that we are going to conduct a large number of experiments. Of course,
the total number of experiments can be decided during the computation.

The intervals of uncertainty remaining at the end of different numbers of experiments
can be computed as follows:

$$L_2 = \lim_{N \to \infty} \frac{F_{N-1}}{F_N} L_0 \tag{5.17}$$

$$L_3 = \lim_{N \to \infty} \frac{F_{N-2}}{F_N} L_0 = \lim_{N \to \infty} \frac{F_{N-2}}{F_{N-1}} \frac{F_{N-1}}{F_N} L_0 \simeq \lim_{N \to \infty} \left(\frac{F_{N-1}}{F_N}\right)^2 L_0 \tag{5.18}$$

This result can be generalized to obtain

$$L_k = \lim_{N \to \infty} \left(\frac{F_{N-1}}{F_N}\right)^{k-1} L_0 \tag{5.19}$$

Using the relation

$$F_N = F_{N-1} + F_{N-2} \tag{5.20}$$

we obtain, after dividing both sides by $F_{N-1}$,

$$\frac{F_N}{F_{N-1}} = 1 + \frac{F_{N-2}}{F_{N-1}} \tag{5.21}$$

By defining a ratio $\gamma$ as

$$\gamma = \lim_{N \to \infty} \frac{F_N}{F_{N-1}} \tag{5.22}$$

Eq. (5.21) can be expressed as

$$\gamma \simeq \frac{1}{\gamma} + 1$$

that is,

$$\gamma^2 - \gamma - 1 = 0 \tag{5.23}$$

This gives the root $\gamma = 1.618$, and hence Eq. (5.19) yields

$$L_k = \left(\frac{1}{\gamma}\right)^{k-1} L_0 = (0.618)^{k-1} L_0 \tag{5.24}$$
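For reference, the positive root of the quadratic equation $\gamma^2 - \gamma - 1 = 0$ and the two constants derived from it that are used in this section follow from the quadratic formula and from dividing the equation by $\gamma^2$:

```latex
\gamma = \frac{1 + \sqrt{5}}{2} \approx 1.618034, \qquad
\frac{1}{\gamma} = \gamma - 1 \approx 0.618034, \qquad
\frac{1}{\gamma^{2}} = 1 - \frac{1}{\gamma} \approx 0.381966
```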


In Eq. (5.18) the ratios $F_{N-2}/F_{N-1}$ and $F_{N-1}/F_N$ have been taken to be same
for large values of $N$. The validity of this assumption can be seen from the following
table:

Value of $N$               2      3       4      5       6        7        8        9        10       $\infty$
Ratio $F_{N-1}/F_N$        0.5    0.667   0.6    0.625   0.6154   0.6190   0.6176   0.6182   0.6180   0.618


The ratio $\gamma$ has a historical background. Ancient Greek architects believed that a
building having the sides $d$ and $b$ satisfying the relation

$$\frac{d + b}{d} = \frac{d}{b} = \gamma \tag{5.25}$$

would have the most pleasing properties (Fig. 5.11). The origin of the name, golden
section method, can also be traced to Euclid's geometry. In Euclid's geometry,
when a line segment is divided into two unequal parts so that the ratio of the whole to
the larger part is equal to the ratio of the larger to the smaller, the division is called
the golden section and the ratio is called the golden mean.


Procedure. The procedure is same as the Fibonacci method except that the location
of the first two experiments is defined by

$$L_2^* = \frac{F_{N-2}}{F_N} L_0 = \frac{F_{N-2}}{F_{N-1}} \frac{F_{N-1}}{F_N} L_0 = \frac{L_0}{\gamma^2} = 0.382 L_0 \tag{5.26}$$

The desired accuracy can be specified to stop the procedure.

Figure 5.11 Rectangular building of sides $b$ and $d$.
Example 5.8 Minimize the function

$$f(x) = 0.65 - [0.75/(1 + x^2)] - 0.65x \tan^{-1}(1/x)$$

using the golden section method with $n = 6$.

SOLUTION The locations of the first two experiments are defined by $L_2^* =
0.382 L_0 = (0.382)(3.0) = 1.1460$. Thus $x_1 = 1.1460$ and $x_2 = 3.0 - 1.1460 = 1.8540$
with $f_1 = f(x_1) = -0.208654$ and $f_2 = f(x_2) = -0.115124$. Since $f_1 < f_2$, we
delete the interval $[x_2, 3.0]$ based on the assumption of unimodality and obtain the new
interval of uncertainty as $L_2 = [0, x_2] = [0.0, 1.8540]$. The third experiment is placed
at $x_3 = 0 + (x_2 - x_1) = 1.8540 - 1.1460 = 0.7080$. Since $f_3 = -0.288943$ is smaller
than $f_1 = -0.208654$, we delete the interval $[x_1, x_2]$ and obtain the new interval of
uncertainty as $[0.0, x_1] = [0.0, 1.1460]$. The position of the next experiment is given
by $x_4 = 0 + (x_1 - x_3) = 1.1460 - 0.7080 = 0.4380$ with $f_4 = -0.308951$.

Since $f_4 < f_3$, we delete $[x_3, x_1]$ and obtain the new interval of uncertainty as
$[0, x_3] = [0.0, 0.7080]$. The next experiment is placed at $x_5 = 0 + (x_3 - x_4) = 0.7080 -
0.4380 = 0.2700$. Since $f_5 = -0.278434$ is larger than $f_4 = -0.308951$, we delete the
interval $[0, x_5]$ and obtain the new interval of uncertainty as $[x_5, x_3] = [0.2700, 0.7080]$.
The final experiment is placed at $x_6 = x_5 + (x_3 - x_4) = 0.2700 + (0.7080 - 0.4380) =
0.5400$ with $f_6 = -0.308234$. Since $f_6 > f_4$, we delete the interval $[x_6, x_3]$ and obtain
the final interval of uncertainty as $[x_5, x_6] = [0.2700, 0.5400]$. Note that this final
interval of uncertainty is slightly larger than the one found in the Fibonacci method,
[0.230770, 0.461540]. The ratio of the final to the initial interval of uncertainty in the
present case is

$$\frac{L_6}{L_0} = \frac{0.5400 - 0.2700}{3.0} = \frac{0.27}{3.0} = 0.09$$
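A compact sketch of the golden section search is given below. It uses the exact value $1/\gamma = (\sqrt{5} - 1)/2$ rather than the rounded 0.618 and 0.382 of the example, so its intermediate points differ slightly from the hand calculation; the function name `golden_section_search` and the fixed experiment count `n` are assumptions of this sketch (a tolerance-based stopping rule would serve equally well).

```python
import math


def golden_section_search(f, a, b, n):
    """Golden section search with n >= 2 experiments on [a, b].

    Returns the final interval of uncertainty, of length (1/gamma)**(n-1) * (b - a).
    """
    tau = (math.sqrt(5.0) - 1.0) / 2.0   # 1/gamma, approximately 0.618
    x1 = b - tau * (b - a)               # a + 0.382 * L_0
    x2 = a + tau * (b - a)               # b - 0.382 * L_0
    f1, f2 = f(x1), f(x2)
    for _ in range(n - 2):               # each pass adds one new experiment
        if f1 < f2:                      # discard (x2, b]; old x1 becomes the new x2
            b, x2, f2 = x2, x1, f1
            x1 = b - tau * (b - a)
            f1 = f(x1)
        else:                            # discard [a, x1); old x2 becomes the new x1
            a, x1, f1 = x1, x2, f2
            x2 = a + tau * (b - a)
            f2 = f(x2)
    if f1 < f2:                          # final discard
        b = x2
    else:
        a = x1
    return a, b


if __name__ == "__main__":
    def f(x):
        return 0.65 - 0.75 / (1.0 + x * x) - 0.65 * x * math.atan(1.0 / x)

    print(golden_section_search(f, 0.0, 3.0, 6))
```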

5.9 COMPARISON OF ELIMINATION METHODS 

The efficiency of an elimination method can be measured in terms of the ratio of the
final and the initial intervals of uncertainty, $L_n/L_0$. The values of this ratio achieved
in various methods for a specified number of experiments ($n = 5$ and $n = 10$) are
compared in Table 5.3. It can be seen that the Fibonacci method is the most efficient
method, followed by the golden section method, in reducing the interval of
uncertainty.

A similar observation can be made by considering the number of experiments (or
function evaluations) needed to achieve a specified accuracy in various methods. The
results are compared in Table 5.4 for maximum permissible errors of 0.1 and 0.01. It
can be seen that to achieve any specified accuracy, the Fibonacci method requires the
least number of experiments, followed by the golden section method.
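The comparison in Table 5.4 can be checked numerically from the reduction formulas of Eqs. (5.2), (5.3), (5.4), (5.15), and (5.24). In the sketch below, the helper names, the dictionary layout, and the small tolerance added to the error test (to guard against floating-point ties) are assumptions; the dichotomous case uses $\delta = 0.01$ and $L_0 = 1$, as in the table.

```python
def fib(n):
    """Fibonacci number F_n with F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Ratio L_n/L_0 for each method, with the first admissible n and the step in n.
METHODS = {
    "Exhaustive search": (lambda n: 2.0 / (n + 1), 2, 1),
    "Dichotomous search": (lambda n: 1.0 / 2 ** (n / 2) + 0.01 * (1.0 - 1.0 / 2 ** (n / 2)), 2, 2),
    "Interval halving": (lambda n: 0.5 ** ((n - 1) / 2), 3, 2),
    "Fibonacci": (lambda n: 1.0 / fib(n), 2, 1),
    "Golden section": (lambda n: 0.618 ** (n - 1), 2, 1),
}


def min_experiments(method, max_error):
    """Smallest admissible n with (1/2) * L_n/L_0 <= max_error."""
    ratio, n, step = METHODS[method]
    while 0.5 * ratio(n) > max_error + 1e-12:
        n += step
    return n


if __name__ == "__main__":
    for name in METHODS:
        print(name, min_experiments(name, 0.1), min_experiments(name, 0.01))
```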


Table 5.3 Final Intervals of Uncertainty

Method                                Formula                                                                         n = 5                                                                    n = 10

Exhaustive search                     $L_n = \frac{2}{n+1} L_0$                                                       $0.33333 L_0$                                                            $0.18182 L_0$
Dichotomous search                    $L_n = \frac{L_0}{2^{n/2}} + \delta\left(1 - \frac{1}{2^{n/2}}\right)$          $\frac{1}{4} L_0 + 0.0075$ with $n = 4$,                                 $0.03125 L_0 + 0.0096875$
  ($\delta = 0.01$ and $n$ = even)                                                                                    $\frac{1}{8} L_0 + 0.00875$ with $n = 6$
Interval halving                      $L_n = \left(\frac{1}{2}\right)^{(n-1)/2} L_0$                                  $0.25 L_0$                                                               $0.0625 L_0$ with $n = 9$,
  ($n \ge 3$ and odd)                                                                                                                                                                          $0.03125 L_0$ with $n = 11$
Fibonacci                             $L_n = \frac{1}{F_n} L_0$                                                       $0.125 L_0$                                                              $0.01124 L_0$
Golden section                        $L_n = (0.618)^{n-1} L_0$                                                       $0.1459 L_0$                                                             $0.01315 L_0$

Table 5.4 Number of Experiments for a Specified Accuracy

Method                                              Error: $\frac{1}{2}\frac{L_n}{L_0} \le 0.1$    Error: $\frac{1}{2}\frac{L_n}{L_0} \le 0.01$

Exhaustive search                                   $n \ge 9$                                      $n \ge 99$
Dichotomous search ($\delta = 0.01$, $L_0 = 1$)     $n \ge 6$                                      $n \ge 14$
Interval halving ($n \ge 3$ and odd)                $n \ge 7$                                      $n \ge 13$
Fibonacci                                           $n \ge 4$                                      $n \ge 9$
Golden section                                      $n \ge 5$                                      $n \ge 10$

Interpolation Methods

The interpolation methods were originally developed as one-dimensional searches
within multivariable optimization techniques, and are generally more efficient than
Fibonacci-type approaches. The aim of all the one-dimensional minimization methods
is to find $\lambda^*$, the smallest nonnegative value of $\lambda$, for which the function

$$f(\lambda) = f(\mathbf{X} + \lambda \mathbf{S}) \tag{5.27}$$

attains a local minimum. Hence if the original function $f(\mathbf{X})$ is expressible as an explicit
function of $x_i$ ($i = 1, 2, \ldots, n$), we can readily write the expression for $f(\lambda) = f(\mathbf{X}
+ \lambda \mathbf{S})$ for any specified vector $\mathbf{S}$, set

$$\frac{df}{d\lambda}(\lambda) = 0 \tag{5.28}$$

and solve Eq. (5.28) to find $\lambda^*$ in terms of $\mathbf{X}$ and $\mathbf{S}$. However, in many practical
problems, the function $f(\lambda)$ cannot be expressed explicitly in terms of $\lambda$ (as shown in
Example 5.1). In such cases the interpolation methods can be used to find the value
of $\lambda^*$.

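Example 5.9 Derive the one-dimensional minimization problem for the following
case:

$$\text{Minimize } f(\mathbf{X}) = (x_1^2 - x_2)^2 + (1 - x_1)^2 \tag{E1}$$

from the starting point $\mathbf{X}_1 = \{-2, -2\}^T$ along the search direction $\mathbf{S} = \{1.00, 0.25\}^T$.

SOLUTION The new design point $\mathbf{X}$ can be expressed as

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \mathbf{X}_1 + \lambda \mathbf{S} = \begin{Bmatrix} -2 + \lambda \\ -2 + 0.25\lambda \end{Bmatrix}$$

By substituting $x_1 = -2 + \lambda$ and $x_2 = -2 + 0.25\lambda$ in Eq. (E1), we obtain $f$ as a
function of $\lambda$ as

$$f(\lambda) = f\begin{pmatrix} -2 + \lambda \\ -2 + 0.25\lambda \end{pmatrix} = [(-2 + \lambda)^2 - (-2 + 0.25\lambda)]^2 + [1 - (-2 + \lambda)]^2$$

$$= \lambda^4 - 8.5\lambda^3 + 31.0625\lambda^2 - 57.0\lambda + 45.0$$

The value of $\lambda$ at which $f(\lambda)$ attains a minimum gives $\lambda^*$.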
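The expansion in Example 5.9 can be checked numerically by comparing the original two-variable function evaluated along the line $\mathbf{X}_1 + \lambda \mathbf{S}$ with the quartic polynomial in $\lambda$. The function names below are hypothetical and serve only this check.

```python
def f(x1, x2):
    """Objective of Example 5.9, Eq. (E1)."""
    return (x1 ** 2 - x2) ** 2 + (1.0 - x1) ** 2


def f_along_line(lam, X1=(-2.0, -2.0), S=(1.0, 0.25)):
    """f evaluated at the point X1 + lam * S."""
    return f(X1[0] + lam * S[0], X1[1] + lam * S[1])


def f_polynomial(lam):
    """Expanded one-dimensional form derived in the example."""
    return lam ** 4 - 8.5 * lam ** 3 + 31.0625 * lam ** 2 - 57.0 * lam + 45.0


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0, 2.5, 4.0):
        print(lam, f_along_line(lam), f_polynomial(lam))
```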


In the following sections, we discuss three different interpolation methods with 
reference to one-dimensional minimization problems that arise during multivariable 
optimization problems. 


5.10 QUADRATIC INTERPOLATION METHOD 

The quadratic interpolation method uses the function values only; hence it is useful
to find the minimizing step ($\lambda^*$) of functions $f(\mathbf{X})$ for which the partial derivatives
with respect to the variables $x_i$ are not available or difficult to compute [5.2, 5.5].
This method finds the minimizing step length $\lambda^*$ in three stages. In the first stage the
$\mathbf{S}$-vector is normalized so that a step length of $\lambda = 1$ is acceptable. In the second stage
the function $f(\lambda)$ is approximated by a quadratic function $h(\lambda)$ and the minimum, $\tilde{\lambda}^*$,
of $h(\lambda)$ is found. If $\tilde{\lambda}^*$ is not sufficiently close to the true minimum $\lambda^*$, a third stage is
used. In this stage a new quadratic function (refit) $h'(\lambda) = a' + b'\lambda + c'\lambda^2$ is used to
approximate $f(\lambda)$, and a new value of $\tilde{\lambda}^*$ is found. This procedure is continued until
a $\tilde{\lambda}^*$ that is sufficiently close to $\lambda^*$ is found.

Stage 1. In this stage,† the $\mathbf{S}$ vector is normalized as follows: Find $\Delta = \max_i |s_i|$,
where $s_i$ is the $i$th component of $\mathbf{S}$, and divide each component of $\mathbf{S}$ by $\Delta$. Another
method of normalization is to find $\Delta = (s_1^2 + s_2^2 + \cdots + s_n^2)^{1/2}$ and divide each
component of $\mathbf{S}$ by $\Delta$.

† This stage is not required if the one-dimensional minimization problem has not arisen within a multivariable
minimization problem.

Stage 2. Let

$$h(\lambda) = a + b\lambda + c\lambda^2 \tag{5.29}$$

be the quadratic function used for approximating the function $f(\lambda)$. It is worth noting
at this point that a quadratic is the lowest-order polynomial for which a finite minimum
can exist. The necessary condition for the minimum of $h(\lambda)$ is that

$$\frac{dh}{d\lambda} = b + 2c\lambda = 0$$

that is,

$$\tilde{\lambda}^* = -\frac{b}{2c} \tag{5.30}$$

The sufficiency condition for the minimum of $h(\lambda)$ is that

$$\left.\frac{d^2 h}{d\lambda^2}\right|_{\tilde{\lambda}^*} > 0$$

that is,

$$c > 0 \tag{5.31}$$

To evaluate the constants $a$, $b$, and $c$ in Eq. (5.29), we need to evaluate the function
$f(\lambda)$ at three points. Let $\lambda = A$, $\lambda = B$, and $\lambda = C$ be the points at which the function
$f(\lambda)$ is evaluated and let $f_A$, $f_B$, and $f_C$ be the corresponding function values, that is,

$$f_A = a + bA + cA^2$$

$$f_B = a + bB + cB^2$$

$$f_C = a + bC + cC^2 \tag{5.32}$$


The solution of Eqs. (5.32) gives

$$a = \frac{f_A BC(C - B) + f_B CA(A - C) + f_C AB(B - A)}{(A - B)(B - C)(C - A)} \tag{5.33}$$

$$b = \frac{f_A(B^2 - C^2) + f_B(C^2 - A^2) + f_C(A^2 - B^2)}{(A - B)(B - C)(C - A)} \tag{5.34}$$

$$c = -\frac{f_A(B - C) + f_B(C - A) + f_C(A - B)}{(A - B)(B - C)(C - A)} \tag{5.35}$$

From Eqs. (5.30), (5.34), and (5.35), the minimum of $h(\lambda)$ can be obtained as

$$\tilde{\lambda}^* = -\frac{b}{2c} = \frac{f_A(B^2 - C^2) + f_B(C^2 - A^2) + f_C(A^2 - B^2)}{2[f_A(B - C) + f_B(C - A) + f_C(A - B)]} \tag{5.36}$$

provided that $c$, as given by Eq. (5.35), is positive.
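
As a minimal sketch of Eqs. (5.33) to (5.36), the fit through three points can be coded directly. The function names `quadratic_fit` and `quadratic_minimizer` are illustrative choices, not part of the text, and the error handling for coincident points or nonpositive $c$ is an added assumption.

```python
def quadratic_fit(A, B, C, fA, fB, fC):
    """Coefficients (a, b, c) of h(lam) = a + b*lam + c*lam**2 through
    (A, fA), (B, fB), (C, fC), following Eqs. (5.33)-(5.35)."""
    D = (A - B) * (B - C) * (C - A)
    if D == 0.0:
        raise ValueError("A, B, and C must be distinct")
    a = (fA * B * C * (C - B) + fB * C * A * (A - C) + fC * A * B * (B - A)) / D
    b = (fA * (B**2 - C**2) + fB * (C**2 - A**2) + fC * (A**2 - B**2)) / D
    c = -(fA * (B - C) + fB * (C - A) + fC * (A - B)) / D
    return a, b, c


def quadratic_minimizer(A, B, C, fA, fB, fC):
    """Minimizer -b/(2c) of the fitted quadratic, Eq. (5.36).
    Raises if c <= 0, since the quadratic then has no finite minimum."""
    a, b, c = quadratic_fit(A, B, C, fA, fB, fC)
    if c <= 0.0:
        raise ValueError("fitted quadratic is not convex (c <= 0)")
    return -b / (2.0 * c)
```

For instance, the points $(0, 5)$, $(2, -43)$, and $(4, 629)$ give $a = 5$, $b = -204$, and $c = 90$.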

To start with, for simplicity, the points $A$, $B$, and $C$ can be chosen as 0, $t$, and $2t$,
respectively, where $t$ is a preselected trial step length. By this procedure, we can save
one function evaluation since $f_A = f(\lambda = 0)$ is generally known from the previous
iteration (of a multivariable search). For this case, Eqs. (5.33) to (5.36) reduce to


$$a = f_A \tag{5.37}$$

$$b = \frac{4f_B - 3f_A - f_C}{2t} \tag{5.38}$$

$$c = \frac{f_C + f_A - 2f_B}{2t^2} \tag{5.39}$$

$$\tilde{\lambda}^* = \frac{4f_B - 3f_A - f_C}{4f_B - 2f_C - 2f_A}\, t \tag{5.40}$$
provided that

$$c = \frac{f_C + f_A - 2f_B}{2t^2} > 0 \tag{5.41}$$

The inequality (5.41) can be satisfied if

$$\frac{f_A + f_C}{2} > f_B \tag{5.42}$$

(i.e., the function value $f_B$ should be smaller than the average value of $f_A$ and $f_C$).
This can be satisfied if $f_B$ lies below the line joining $f_A$ and $f_C$ as shown in Fig. 5.12.

The following procedure can be used not only to satisfy the inequality (5.42) but
also to ensure that the minimum $\tilde{\lambda}^*$ lies in the interval $0 < \tilde{\lambda}^* < 2t$.

1. Assuming that $f_A = f(\lambda = 0)$ and the initial step size $t_0$ are known, evaluate
the function $f$ at $\lambda = t_0$ and obtain $f_1 = f(\lambda = t_0)$. The possible outcomes are
shown in Fig. 5.13.

2. If $f_1 > f_A$ is realized (Fig. 5.13c), set $f_C = f_1$, evaluate the function $f$ at
$\lambda = t_0/2$ to obtain $f_B$, and compute $\tilde{\lambda}^*$ using Eq. (5.40) with $t = t_0/2$.

3. If $f_1 \le f_A$ is realized (Fig. 5.13a or b), set $f_B = f_1$, and evaluate the function $f$
at $\lambda = 2t_0$ to find $f_2 = f(\lambda = 2t_0)$. This may result in any one of the situations
shown in Fig. 5.14.

4. If $f_2$ turns out to be greater than $f_1$ (Fig. 5.14b or c), set $f_C = f_2$ and compute
$\tilde{\lambda}^*$ according to Eq. (5.40) with $t = t_0$.

5. If $f_2$ turns out to be smaller than $f_1$, set new $f_1 = f_2$ and $t_0 = 2t_0$, and repeat
steps 2 to 4 until we are able to find $\tilde{\lambda}^*$.
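
The five steps above can be sketched as follows. This is an illustrative reading of the procedure, not a definitive implementation: the name `stage2_quadratic_step`, the cap on the number of doublings, and the treatment of ties ($f_2 = f_1$ handled as in step 4) are assumptions added here.

```python
def stage2_quadratic_step(f, t0, max_doublings=60):
    """Choose A = 0, B = t, C = 2t by steps 1-5 and return (lam_tilde, t),
    where lam_tilde comes from Eq. (5.40)."""
    f_A = f(0.0)
    f_1 = f(t0)                       # step 1
    if f_1 > f_A:                     # step 2: minimum lies before t0
        t = t0 / 2.0
        f_B, f_C = f(t), f_1
    else:                             # steps 3-5: keep doubling the step
        for _ in range(max_doublings):
            f_2 = f(2.0 * t0)
            if f_2 >= f_1:            # step 4 (ties treated as step 4: assumption)
                break
            f_1, t0 = f_2, 2.0 * t0   # step 5
        else:
            raise RuntimeError("no bracket found; f keeps decreasing")
        t, f_B, f_C = t0, f_1, f_2
    denom = 4.0 * f_B - 2.0 * f_C - 2.0 * f_A
    if denom >= 0.0:                  # equivalent to c <= 0 in Eq. (5.41)
        raise ValueError("inequality (5.42) is not satisfied")
    return (4.0 * f_B - 3.0 * f_A - f_C) / denom * t, t
```

For $f = \lambda^5 - 5\lambda^3 - 20\lambda + 5$ with $t_0 = 0.5$, the step is doubled to $t = 2$ and Eq. (5.40) gives $\tilde{\lambda}^* = 1632/1440$.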


Stage 3. The $\tilde{\lambda}^*$ found in stage 2 is the minimum of the approximating quadratic
$h(\lambda)$ and we have to make sure that this $\tilde{\lambda}^*$ is sufficiently close to the true minimum $\lambda^*$
of $f(\lambda)$ before taking $\lambda^* \simeq \tilde{\lambda}^*$. Several tests are possible to ascertain this. One possible
test is to compare $f(\tilde{\lambda}^*)$ with $h(\tilde{\lambda}^*)$ and consider $\tilde{\lambda}^*$ a sufficiently good approximation
if they differ by no more than a small amount. This criterion can be stated as

$$\left|\frac{h(\tilde{\lambda}^*) - f(\tilde{\lambda}^*)}{f(\tilde{\lambda}^*)}\right| \le \varepsilon_1 \tag{5.43}$$

Another possible test is to examine whether $df/d\lambda$ is close to zero at $\tilde{\lambda}^*$. Since the
derivatives of $f$ are not used in this method, we can use a finite-difference formula for
$df/d\lambda$ and use the criterion

$$\left|\frac{f(\tilde{\lambda}^* + \Delta\tilde{\lambda}^*) - f(\tilde{\lambda}^* - \Delta\tilde{\lambda}^*)}{2\Delta\tilde{\lambda}^*}\right| \le \varepsilon_2 \tag{5.44}$$

to stop the procedure. In Eqs. (5.43) and (5.44), $\varepsilon_1$ and $\varepsilon_2$ are small numbers to be
specified depending on the accuracy desired.

Figure 5.12 $f_B$ smaller than $(f_A + f_C)/2$.

Figure 5.13 Possible outcomes when the function is evaluated at $\lambda = t_0$: (a) $f_1 < f_A$ and
$t_0 < \tilde{\lambda}^*$; (b) $f_1 < f_A$ and $t_0 > \tilde{\lambda}^*$; (c) $f_1 > f_A$ and $t_0 > \tilde{\lambda}^*$.

Figure 5.14 Possible outcomes when function is evaluated at $\lambda = t_0$ and $2t_0$: (a) $f_2 < f_1$ and
$f_2 < f_A$; (b) $f_2 < f_A$ and $f_2 > f_1$; (c) $f_2 > f_A$ and $f_2 > f_1$.

If the convergence criteria stated in Eqs. (5.43) and (5.44) are not satisfied, a new
quadratic function

$$h'(\lambda) = a' + b'\lambda + c'\lambda^2$$

is used to approximate the function $f(\lambda)$. To evaluate the constants $a'$, $b'$, and $c'$,
the three best function values of the current $f_A = f(\lambda = 0)$, $f_B = f(\lambda = t_0)$,
$f_C = f(\lambda = 2t_0)$, and $\tilde{f} = f(\lambda = \tilde{\lambda}^*)$ are to be used. This process of trying to fit
another polynomial to obtain a better approximation to $\lambda^*$ is known as refitting the
polynomial.

For refitting the quadratic, we consider all possible situations and select the best
three points of the present $A$, $B$, $C$, and $\tilde{\lambda}^*$. There are four possibilities, as shown
in Fig. 5.15. The best three points to be used in refitting in each case are given in
Table 5.5. A new value of $\tilde{\lambda}^*$ is computed by using the general formula, Eq. (5.36). If
this $\tilde{\lambda}^*$ also does not satisfy the convergence criteria stated in Eqs. (5.43) and (5.44),
a new quadratic has to be refitted according to the scheme outlined in Table 5.5.


Figure 5.15 Various possibilities for refitting.
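Table 5.5 Refitting Scheme

Case   Characteristics                                  New points for refitting (new <- old)                  Neglected point
1      $\tilde{\lambda}^* > B$, $\tilde{f} < f_B$       $A \leftarrow B$, $B \leftarrow \tilde{\lambda}^*$, $C \leftarrow C$      old $A$
2      $\tilde{\lambda}^* > B$, $\tilde{f} > f_B$       $A \leftarrow A$, $B \leftarrow B$, $C \leftarrow \tilde{\lambda}^*$      old $C$
3      $\tilde{\lambda}^* < B$, $\tilde{f} < f_B$       $A \leftarrow A$, $B \leftarrow \tilde{\lambda}^*$, $C \leftarrow B$      old $C$
4      $\tilde{\lambda}^* < B$, $\tilde{f} > f_B$       $A \leftarrow \tilde{\lambda}^*$, $B \leftarrow B$, $C \leftarrow C$      old $A$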
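
The selection rule of Table 5.5 can be written as a small helper. This is a sketch only: the name `refit_points` and the argument order are hypothetical, and the handling of the boundary cases $\tilde{\lambda}^* = B$ and $\tilde{f} = f_B$, which the table does not cover, is an assumption noted in the comments.

```python
def refit_points(A, B, C, lam, fA, fB, fC, f_lam):
    """Return the new (A, B, C, fA, fB, fC) for refitting, per Table 5.5.
    lam is the current quadratic minimizer and f_lam = f(lam).
    Ties (lam == B or f_lam == fB) fall into cases 3/4 and 2/4
    respectively; that choice is an assumption, not from the table."""
    if lam > B:
        if f_lam < fB:                       # case 1: neglect old A
            return B, lam, C, fB, f_lam, fC
        return A, B, lam, fA, fB, f_lam      # case 2: neglect old C
    if f_lam < fB:                           # case 3: neglect old C
        return A, lam, B, fA, f_lam, fB
    return lam, B, C, f_lam, fB, fC          # case 4: neglect old A
```

With $A = 0$, $B = 2$, $C = 4$ and a trial point $\tilde{\lambda}^* = 1.135$ whose function value exceeds $f_B$, the helper returns the case 4 selection $(1.135, 2, 4)$.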



Example 5.10 Find the minimum of $f = \lambda^5 - 5\lambda^3 - 20\lambda + 5$.

SOLUTION Since this is not a multivariable optimization problem, we can proceed
directly to stage 2. Let the initial step size be taken as $t_0 = 0.5$ and $A = 0$.

Iteration 1

$$f_A = f(\lambda = 0) = 5$$

$$f_1 = f(\lambda = t_0) = 0.03125 - 5(0.125) - 20(0.5) + 5 = -5.59375$$

Since $f_1 < f_A$, we set $f_B = f_1 = -5.59375$, and find that

$$f_2 = f(\lambda = 2t_0 = 1.0) = -19.0$$

As $f_2 < f_1$, we set new $t_0 = 1$ and $f_1 = -19.0$. Again we find that $f_1 < f_A$ and hence
set $f_B = f_1 = -19.0$, and find that $f_2 = f(\lambda = 2t_0 = 2) = -43$. Since $f_2 < f_1$, we
again set $t_0 = 2$ and $f_1 = -43$. As this $f_1 < f_A$, set $f_B = f_1 = -43$ and evaluate
$f_2 = f(\lambda = 2t_0 = 4) = 629$. This time $f_2 > f_1$ and hence we set $f_C = f_2 = 629$ and
compute $\tilde{\lambda}^*$ from Eq. (5.40) as

$$\tilde{\lambda}^* = \frac{4(-43) - 3(5) - 629}{4(-43) - 2(629) - 2(5)}\,(2) = \frac{1632}{1440} \approx 1.135$$

Convergence test: Since $A = 0$, $f_A = 5$, $B = 2$, $f_B = -43$, $C = 4$, and $f_C = 629$,
the values of $a$, $b$, and $c$ can be found to be

$$a = 5, \quad b = -204, \quad c = 90$$

and

$$h(\tilde{\lambda}^*) = h(1.135) = 5 - 204(1.135) + 90(1.135)^2 = -110.6$$

Since

$$\tilde{f} = f(\tilde{\lambda}^*) = (1.135)^5 - 5(1.135)^3 - 20(1.135) + 5.0 = -23.127$$

we have

$$\left|\frac{h(\tilde{\lambda}^*) - f(\tilde{\lambda}^*)}{f(\tilde{\lambda}^*)}\right| = \left|\frac{-110.6 + 23.127}{-23.127}\right| = 3.78$$

As this quantity is very large, convergence is not achieved and hence we have to use
refitting.


Iteration 2

Since $\tilde{\lambda}^* < B$ and $\tilde{f} > f_B$, we take the new values of $A$, $B$, and $C$ as

$$A = 1.135, \quad f_A = -23.127$$

$$B = 2.0, \quad f_B = -43.0$$

$$C = 4.0, \quad f_C = 629.0$$

and compute new $\tilde{\lambda}^*$, using Eq. (5.36), as

$$\tilde{\lambda}^* = \frac{(-23.127)(4.0 - 16.0) + (-43.0)(16.0 - 1.29) + (629.0)(1.29 - 4.0)}{2[(-23.127)(2.0 - 4.0) + (-43.0)(4.0 - 1.135) + (629.0)(1.135 - 2.0)]} = 1.661$$

Convergence test: To test the convergence, we compute the coefficients of the
quadratic as

$$a = 288.0, \quad b = -417.0, \quad c = 125.3$$

As

$$h(\tilde{\lambda}^*) = h(1.661) = 288.0 - 417.0(1.661) + 125.3(1.661)^2 = -59.7$$

$$\tilde{f} = f(\tilde{\lambda}^*) = 12.8 - 5(4.59) - 20(1.661) + 5.0 = -38.37$$

we obtain

$$\left|\frac{h(\tilde{\lambda}^*) - f(\tilde{\lambda}^*)}{f(\tilde{\lambda}^*)}\right| = \left|\frac{-59.70 + 38.37}{-38.37}\right| = 0.556$$

Since this quantity is not sufficiently small, we need to proceed to the next refit.
5.11 CUBIC INTERPOLATION METHOD

The cubic interpolation method finds the minimizing step length $\lambda^*$ in four stages [5.5,
5.11]. It makes use of the derivative of the function $f$:

$$f'(\lambda) = \frac{df}{d\lambda} = \frac{d}{d\lambda} f(\mathbf{X} + \lambda\mathbf{S}) = \mathbf{S}^T \nabla f(\mathbf{X} + \lambda\mathbf{S})$$

The first stage normalizes the $\mathbf{S}$ vector so that a step size $\lambda = 1$ is acceptable. The
second stage establishes bounds on $\lambda^*$, and the third stage finds the value of $\tilde{\lambda}^*$ by
approximating $f(\lambda)$ by a cubic polynomial $h(\lambda)$. If the $\tilde{\lambda}^*$ found in stage 3 does
not satisfy the prescribed convergence criteria, the cubic polynomial is refitted in the
fourth stage.

Stage 1. Calculate $\Delta = \max_i |s_i|$, where $|s_i|$ is the absolute value of the $i$th component
of $\mathbf{S}$, and divide each component of $\mathbf{S}$ by $\Delta$. An alternative method of normalization
is to find

$$\Delta = (s_1^2 + s_2^2 + \cdots + s_n^2)^{1/2}$$

and divide each component of $\mathbf{S}$ by $\Delta$.

Stage 2. To establish lower and upper bounds on the optimal step size $\lambda^*$, we need
to find two points $A$ and $B$ at which the slope $df/d\lambda$ has different signs. We know that
at $\lambda = 0$,

$$\left.\frac{df}{d\lambda}\right|_{\lambda = 0} = \mathbf{S}^T \nabla f(\mathbf{X}) < 0$$

since $\mathbf{S}$ is presumed to be a direction of descent.†

Hence to start with we can take $A = 0$ and try to find a point $\lambda = B$ at which the
slope $df/d\lambda$ is positive. Point $B$ can be taken as the first value out of $t_0, 2t_0, 4t_0, 8t_0, \ldots$
at which $f'$ is nonnegative, where $t_0$ is a preassigned initial step size. It then follows
that $\lambda^*$ is bounded in the interval $A < \lambda^* \le B$ (Fig. 5.16).

Figure 5.16 Minimum of $f(\lambda)$ lies between $A$ and $B$.

† In this case the angle between the direction of steepest descent and $\mathbf{S}$ will be less than 90°.

Stage 3. If the cubic equation

$$h(\lambda) = a + b\lambda + c\lambda^2 + d\lambda^3 \tag{5.45}$$

is used to approximate the function $f(\lambda)$ between points $A$ and $B$, we need to find the
values $f_A = f(\lambda = A)$, $f'_A = df/d\lambda\,(\lambda = A)$, $f_B = f(\lambda = B)$, and $f'_B = df/d\lambda\,(\lambda = B)$
in order to evaluate the constants $a$, $b$, $c$, and $d$ in Eq. (5.45). By assuming that
$A \ne 0$, we can derive a general formula for $\tilde{\lambda}^*$. From Eq. (5.45) we have

$$f_A = a + bA + cA^2 + dA^3$$

$$f_B = a + bB + cB^2 + dB^3$$

$$f'_A = b + 2cA + 3dA^2$$

$$f'_B = b + 2cB + 3dB^2 \tag{5.46}$$

Equations (5.46) can be solved to find the constants as

$$a = f_A - bA - cA^2 - dA^3 \tag{5.47}$$

with

$$b = \frac{1}{(A - B)^2}\,(B^2 f'_A + A^2 f'_B + 2ABZ) \tag{5.48}$$

$$c = -\frac{1}{(A - B)^2}\,[(A + B)Z + B f'_A + A f'_B] \tag{5.49}$$

and

$$d = \frac{1}{3(A - B)^2}\,(2Z + f'_A + f'_B) \tag{5.50}$$

where

$$Z = \frac{3(f_A - f_B)}{B - A} + f'_A + f'_B \tag{5.51}$$
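
Equations (5.47) to (5.51) can be checked numerically with a short sketch. The function name `cubic_fit` and its argument order are illustrative assumptions; the body only transcribes the formulas above.

```python
def cubic_fit(A, B, fA, fB, dfA, dfB):
    """Coefficients (a, b, c, d) of h(lam) = a + b*lam + c*lam**2 + d*lam**3
    matching the values fA, fB and the slopes dfA, dfB at lam = A and
    lam = B, following Eqs. (5.47)-(5.51)."""
    if A == B:
        raise ValueError("A and B must be distinct")
    Z = 3.0 * (fA - fB) / (B - A) + dfA + dfB                 # Eq. (5.51)
    L2 = (A - B) ** 2
    b = (B**2 * dfA + A**2 * dfB + 2.0 * A * B * Z) / L2      # Eq. (5.48)
    c = -((A + B) * Z + B * dfA + A * dfB) / L2               # Eq. (5.49)
    d = (2.0 * Z + dfA + dfB) / (3.0 * L2)                    # Eq. (5.50)
    a = fA - b * A - c * A**2 - d * A**3                      # Eq. (5.47)
    return a, b, c, d
```

When the data come from an exact cubic, the fit should reproduce it. For $f = \lambda^3 - 3\lambda$ sampled at $A = 0.5$ and $B = 2$, the sketch returns $a = 0$, $b = -3$, $c = 0$, $d = 1$.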


The necessary condition for the minimum of $h(\lambda)$ given by Eq. (5.45) is that

$$\frac{dh}{d\lambda} = b + 2c\lambda + 3d\lambda^2 = 0$$

that is,

$$\tilde{\lambda}^* = \frac{-c \pm (c^2 - 3bd)^{1/2}}{3d} \tag{5.52}$$

The application of the sufficiency condition for the minimum of $h(\lambda)$ leads to the
relation

$$\left.\frac{d^2h}{d\lambda^2}\right|_{\tilde{\lambda}^*} = 2c + 6d\tilde{\lambda}^* > 0 \tag{5.53}$$

By substituting the expressions for $b$, $c$, and $d$ given by Eqs. (5.48) to (5.50) into
Eqs. (5.52) and (5.53), we obtain

$$\tilde{\lambda}^* = A + \frac{f'_A + Z \pm Q}{f'_A + f'_B + 2Z}\,(B - A) \tag{5.54}$$
where

$$Q = (Z^2 - f'_A f'_B)^{1/2} \tag{5.55}$$

$$2(B - A)(2Z + f'_A + f'_B)(f'_A + Z \pm Q) - 2(B - A)(f_A'^2 + Z f'_B + 3Z f'_A + 2Z^2) - 2(B + A) f'_A f'_B > 0 \tag{5.56}$$

By specializing Eqs. (5.47) to (5.56) for the case where $A = 0$, we obtain

$$a = f_A$$

$$b = f'_A$$

$$c = -\frac{1}{B}\,(Z + f'_A)$$

$$d = \frac{1}{3B^2}\,(2Z + f'_A + f'_B)$$

$$\tilde{\lambda}^* = B\,\frac{f'_A + Z \pm Q}{f'_A + f'_B + 2Z} \tag{5.57}$$

$$Q = (Z^2 - f'_A f'_B)^{1/2} > 0 \tag{5.58}$$

where

$$Z = \frac{3(f_A - f_B)}{B} + f'_A + f'_B \tag{5.59}$$

The two values of $\tilde{\lambda}^*$ in Eqs. (5.54) and (5.57) correspond to the two possibilities
for the vanishing of $h'(\lambda)$ [i.e., at a maximum of $h(\lambda)$ and at a minimum]. To avoid
imaginary values of $Q$, we should ensure the satisfaction of the condition

$$Z^2 - f'_A f'_B \ge 0$$

in Eq. (5.55). This inequality is satisfied automatically since $A$ and $B$ are selected
such that $f'_A < 0$ and $f'_B \ge 0$. Furthermore, the sufficiency condition (when $A = 0$)
requires that $Q > 0$, which is already satisfied, so the root with the positive sign
in front of $Q$ is the one that corresponds to the minimum. Now we compute $\tilde{\lambda}^*$ using Eq. (5.57)
and proceed to the next stage.

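Stage 4. The value of $\tilde{\lambda}^*$ found in stage 3 is the true minimum of $h(\lambda)$ and may
not be close to the minimum of $f(\lambda)$. Hence the following convergence criteria can be
used before choosing $\lambda^* \approx \tilde{\lambda}^*$:

$$\left|\frac{h(\tilde{\lambda}^*) - f(\tilde{\lambda}^*)}{f(\tilde{\lambda}^*)}\right| \le \varepsilon_1 \tag{5.60}$$

$$\left|\frac{df}{d\lambda}\right|_{\tilde{\lambda}^*} = \left|\mathbf{S}^T \nabla f\right|_{\tilde{\lambda}^*} \le \varepsilon_2 \tag{5.61}$$

where $\varepsilon_1$ and $\varepsilon_2$ are small numbers whose values depend on the accuracy desired. The
criterion of Eq. (5.61) can be stated in nondimensional form as

$$\left|\frac{\mathbf{S}^T \nabla f}{|\mathbf{S}|\,|\nabla f|}\right|_{\tilde{\lambda}^*} \le \varepsilon_2 \tag{5.62}$$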

If the criteria stated in Eqs. (5.60) and (5.62) are not satisfied, a new cubic equation

$$h'(\lambda) = a' + b'\lambda + c'\lambda^2 + d'\lambda^3$$

can be used to approximate $f(\lambda)$. The constants $a'$, $b'$, $c'$, and $d'$ can be evaluated
by using the function and derivative values at the best two points out of the three
points currently available: $A$, $B$, and $\tilde{\lambda}^*$. Now the general formula given by Eq. (5.54)
is to be used for finding the optimal step size $\tilde{\lambda}^*$. If $f'(\tilde{\lambda}^*) < 0$, the new points $A$
and $B$ are taken as $\tilde{\lambda}^*$ and $B$, respectively; otherwise [if $f'(\tilde{\lambda}^*) > 0$], the new points
$A$ and $B$ are taken as $A$ and $\tilde{\lambda}^*$, and Eq. (5.54) is applied to find the new value of
$\tilde{\lambda}^*$. Equations (5.60) and (5.62) are again used to test for the convergence of $\tilde{\lambda}^*$. If
convergence is achieved, $\tilde{\lambda}^*$ is taken as $\lambda^*$ and the procedure is stopped. Otherwise,
the entire procedure is repeated until the desired convergence is achieved.
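
Stages 2 to 4 can be put together in a compact sketch for a one-variable function, so stage 1 is skipped. Several choices here are assumptions rather than part of the method as stated: the name `cubic_interpolation`, the use of only the slope test $|f'(\tilde{\lambda}^*)| \le \varepsilon$ for convergence, the iteration caps, and the error raised when the fitted cubic degenerates.

```python
import math


def cubic_interpolation(f, df, t0, eps=1e-6, max_iter=100):
    """Minimize f(lam) for lam >= 0 by cubic interpolation, given df = f'."""
    A, fA, dfA = 0.0, f(0.0), df(0.0)
    if dfA >= 0.0:
        raise ValueError("f must be decreasing at lam = 0")
    # Stage 2: first of t0, 2*t0, 4*t0, ... with nonnegative slope.
    B = t0
    for _ in range(60):
        dfB = df(B)
        if dfB >= 0.0:
            break
        B *= 2.0
    else:
        raise RuntimeError("no upper bound B found")
    fB = f(B)
    # Stages 3 and 4: fit, test, and refit on the shrinking bracket [A, B].
    for _ in range(max_iter):
        Z = 3.0 * (fA - fB) / (B - A) + dfA + dfB      # Eq. (5.51)
        Q = math.sqrt(Z * Z - dfA * dfB)               # Eq. (5.55)
        denom = dfA + dfB + 2.0 * Z
        if denom == 0.0:
            raise ZeroDivisionError("fitted cubic degenerates to a quadratic")
        lam = A + (dfA + Z + Q) / denom * (B - A)      # Eq. (5.54), + sign
        d_lam = df(lam)
        if abs(d_lam) <= eps:
            return lam
        if d_lam < 0.0:
            A, fA, dfA = lam, f(lam), d_lam            # new A = lam
        else:
            B, fB, dfB = lam, f(lam), d_lam            # new B = lam
    raise RuntimeError("cubic interpolation did not converge")
```

For $f = \lambda^5 - 5\lambda^3 - 20\lambda + 5$ with $t_0 = 0.4$, the bracket found in stage 2 is $[0, 3.2]$ and the iterates approach the stationary point $\lambda = 2$, where $f' = 5\lambda^4 - 15\lambda^2 - 20$ vanishes.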

The flowchart for implementing the cubic interpolation method is given in Fig. 5.17. 


Example 5.11 Find the minimum of $f = \lambda^5 - 5\lambda^3 - 20\lambda + 5$ by the cubic interpolation
method.

SOLUTION Since this problem has not arisen during a multivariable optimization
process, we can skip stage 1. We take $A = 0$ and find that

$$\frac{df}{d\lambda}(\lambda = A = 0) = \left. 5\lambda^4 - 15\lambda^2 - 20 \right|_{\lambda = 0} = -20 < 0$$

To find $B$ at which $df/d\lambda$ is nonnegative, we start with $t_0 = 0.4$ and evaluate the
derivative at $t_0, 2t_0, 4t_0, \ldots$. This gives

$$f'(t_0 = 0.4) = 5(0.4)^4 - 15(0.4)^2 - 20.0 = -22.272$$

$$f'(2t_0 = 0.8) = 5(0.8)^4 - 15(0.8)^2 - 20.0 = -27.552$$

$$f'(4t_0 = 1.6) = 5(1.6)^4 - 15(1.6)^2 - 20.0 = -25.632$$

$$f'(8t_0 = 3.2) = 5(3.2)^4 - 15(3.2)^2 - 20.0 = 350.688$$

Thus we find that†

$$A = 0.0, \quad f_A = 5.0, \quad f'_A = -20.0$$

$$B = 3.2, \quad f_B = 113.0, \quad f'_B = 350.688$$

$$A < \lambda^* < B$$

† As $f'$ has been found to be negative at $\lambda = 1.6$ also, we can take $A = 1.6$ for faster convergence.
Figure 5.17 Flowchart for cubic interpolation method.

Iteration 1

To find the value of $\tilde{\lambda}^*$ and to test the convergence criteria, we first compute $Z$ and $Q$
as

$$Z = \frac{3(5.0 - 113.0)}{3.2} - 20.0 + 350.688 = 229.588$$

$$Q = [229.588^2 + (20.0)(350.688)]^{1/2} = 244.0$$

Hence

$$\tilde{\lambda}^* = 3.2\left(\frac{-20.0 + 229.588 \pm 244.0}{-20.0 + 350.688 + 459.176}\right) = 1.84 \text{ or } -0.1396$$

By discarding the negative value, we have

$$\tilde{\lambda}^* = 1.84$$

Convergence criterion: If $\tilde{\lambda}^*$ is close to the true minimum, $\lambda^*$, then $f'(\tilde{\lambda}^*) =
df(\tilde{\lambda}^*)/d\lambda$ should be approximately zero. Since $f' = 5\lambda^4 - 15\lambda^2 - 20$,

$$f'(\tilde{\lambda}^*) = 5(1.84)^4 - 15(1.84)^2 - 20 = -13.0$$

Since this is not small, we go to the next iteration or refitting. As $f'(\tilde{\lambda}^*) < 0$, we take
$A = \tilde{\lambda}^*$ and

$$f_A = f(\tilde{\lambda}^*) = (1.84)^5 - 5(1.84)^3 - 20(1.84) + 5 = -41.70$$

Thus

$$A = 1.84, \quad f_A = -41.70, \quad f'_A = -13.0$$

$$B = 3.2, \quad f_B = 113.0, \quad f'_B = 350.688$$

$$A < \lambda^* < B$$

Iteration 2

$$Z = \frac{3(-41.7 - 113.0)}{3.20 - 1.84} - 13.0 + 350.688 = -3.312$$

$$Q = [(-3.312)^2 + (13.0)(350.688)]^{1/2} = 67.5$$

Hence

$$\tilde{\lambda}^* = 1.84 + \frac{-13.0 - 3.312 + 67.5}{-13.0 + 350.688 - 6.624}\,(3.2 - 1.84) = 2.05$$

Convergence criterion:

$$f'(\tilde{\lambda}^*) = 5.0(2.05)^4 - 15.0(2.05)^2 - 20.0 = 5.35$$

Since this value is large, we go to the next iteration with $B = \tilde{\lambda}^* = 2.05$ [as $f'(\tilde{\lambda}^*) > 0$]
and

$$f_B = (2.05)^5 - 5.0(2.05)^3 - 20.0(2.05) + 5.0 = -42.90$$

Thus

$$A = 1.84, \quad f_A = -41.70, \quad f'_A = -13.00$$

$$B = 2.05, \quad f_B = -42.90, \quad f'_B = 5.35$$

$$A < \lambda^* < B$$


Iteration 3

$$Z = \frac{3.0(-41.70 + 42.90)}{(2.05 - 1.84)} - 13.00 + 5.35 = 9.49$$

$$Q = [(9.49)^2 + (13.0)(5.35)]^{1/2} = 12.61$$

Therefore,

$$\tilde{\lambda}^* = 1.84 + \frac{-13.00 + 9.49 + 12.61}{-13.00 + 5.35 + 18.98}\,(2.05 - 1.84) = 2.0086$$

Convergence criterion:

$$f'(\tilde{\lambda}^*) = 5.0(2.0086)^4 - 15.0(2.0086)^2 - 20.0 = 0.855$$

Assuming that this value is close to zero, we can stop the iterative process and take

$$\lambda^* \simeq \tilde{\lambda}^* = 2.0086$$


5.12 DIRECT ROOT METHODS 

The necessary condition for $f(\lambda)$ to have a minimum at $\lambda^*$ is that $f'(\lambda^*) = 0$. The
direct root methods seek to find the root (or solution) of the equation $f'(\lambda) = 0$. Three
root-finding methods (the Newton, the quasi-Newton, and the secant methods) are
discussed in this section.

5.12.1 Newton Method

Consider the quadratic approximation of the function $f(\lambda)$ at $\lambda = \lambda_i$ using the Taylor's
series expansion:

$$f(\lambda) = f(\lambda_i) + f'(\lambda_i)(\lambda - \lambda_i) + \tfrac{1}{2} f''(\lambda_i)(\lambda - \lambda_i)^2 \tag{5.63}$$

By setting the derivative of Eq. (5.63) equal to zero for the minimum of $f(\lambda)$, we
obtain

$$f'(\lambda) = f'(\lambda_i) + f''(\lambda_i)(\lambda - \lambda_i) = 0 \tag{5.64}$$

If $\lambda_i$ denotes an approximation to the minimum of $f(\lambda)$, Eq. (5.64) can be rearranged
to obtain an improved approximation as

$$\lambda_{i+1} = \lambda_i - \frac{f'(\lambda_i)}{f''(\lambda_i)} \tag{5.65}$$
Thus the Newton method, Eq. (5.65), is equivalent to using a quadratic approximation
for the function $f(\lambda)$ and applying the necessary conditions. The iterative process given
by Eq. (5.65) can be assumed to have converged when the derivative, $f'(\lambda_{i+1})$, is close
to zero:

$$|f'(\lambda_{i+1})| \le \varepsilon \tag{5.66}$$

where $\varepsilon$ is a small quantity. The convergence process of the method is shown graphically
in Fig. 5.18a.

Remarks:

1. The Newton method was originally developed by Newton for solving nonlinear
equations and later refined by Raphson, and hence the method is also known as
the Newton-Raphson method in the literature of numerical analysis.

2. The method requires both the first- and second-order derivatives of $f(\lambda)$.

3. If $f''(\lambda_i) \ne 0$ [in Eq. (5.65)], the Newton iterative method has a powerful
(fastest) convergence property, known as quadratic convergence.†

4. If the starting point for the iterative process is not close to the true solution $\lambda^*$,
the Newton iterative process might diverge as illustrated in Fig. 5.18b.
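
A minimal sketch of the iteration in Eq. (5.65) with the stopping test of Eq. (5.66) is given below. The names `newton_minimize`, `df`, and `d2f` are illustrative. The test function is $f(\lambda) = 0.65 - 0.75/(1 + \lambda^2) - 0.65\lambda\tan^{-1}(1/\lambda)$, and `math.atan2(1.0, lam)` is used in place of $\tan^{-1}(1/\lambda)$ so that $\lambda = 0$ can also be evaluated; the two agree for $\lambda > 0$.

```python
import math


def df(lam):
    """First derivative of the test function."""
    return (1.5 * lam / (1.0 + lam**2) ** 2
            + 0.65 * lam / (1.0 + lam**2)
            - 0.65 * math.atan2(1.0, lam))


def d2f(lam):
    """Second derivative of the test function."""
    return (2.8 - 3.2 * lam**2) / (1.0 + lam**2) ** 3


def newton_minimize(df, d2f, lam, eps=0.01, max_iter=50):
    """Apply Eq. (5.65) until |f'(lam)| <= eps, Eq. (5.66).
    Returns the final point and the number of iterations used."""
    for i in range(1, max_iter + 1):
        curvature = d2f(lam)
        if curvature == 0.0:
            raise ZeroDivisionError("f'' vanished; Newton step undefined")
        lam = lam - df(lam) / curvature
        if abs(df(lam)) <= eps:
            return lam, i
    raise RuntimeError("Newton iteration did not converge")
```

Starting from $\lambda_1 = 0.1$ with $\varepsilon = 0.01$, `newton_minimize(df, d2f, 0.1)` stops after three iterations near $\lambda = 0.4804$. As remark 4 warns, no safeguard against divergence is included.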


Figure 5.18 Iterative process of Newton method: (a) convergence; (b) divergence.

† The definition of quadratic convergence is given in Section 6.7.
Example 5.12 Find the minimum of the function

$$f(\lambda) = 0.65 - \frac{0.75}{1 + \lambda^2} - 0.65\lambda \tan^{-1}\frac{1}{\lambda}$$

using the Newton-Raphson method with the starting point $\lambda_1 = 0.1$. Use $\varepsilon = 0.01$ in
Eq. (5.66) for checking the convergence.

SOLUTION The first and second derivatives of the function $f(\lambda)$ are given by

$$f'(\lambda) = \frac{1.5\lambda}{(1 + \lambda^2)^2} + \frac{0.65\lambda}{1 + \lambda^2} - 0.65 \tan^{-1}\frac{1}{\lambda}$$

$$f''(\lambda) = \frac{1.5(1 - 3\lambda^2)}{(1 + \lambda^2)^3} + \frac{0.65(1 - \lambda^2)}{(1 + \lambda^2)^2} + \frac{0.65}{1 + \lambda^2} = \frac{2.8 - 3.2\lambda^2}{(1 + \lambda^2)^3}$$


Iteration 1

$$\lambda_1 = 0.1, \quad f(\lambda_1) = -0.188197, \quad f'(\lambda_1) = -0.744832, \quad f''(\lambda_1) = 2.68659$$

$$\lambda_2 = \lambda_1 - \frac{f'(\lambda_1)}{f''(\lambda_1)} = 0.377241$$

Convergence check: $|f'(\lambda_2)| = |-0.138230| > \varepsilon$.

Iteration 2

$$f(\lambda_2) = -0.303279, \quad f'(\lambda_2) = -0.138230, \quad f''(\lambda_2) = 1.57296$$

$$\lambda_3 = \lambda_2 - \frac{f'(\lambda_2)}{f''(\lambda_2)} = 0.465119$$

Convergence check: $|f'(\lambda_3)| = |-0.0179078| > \varepsilon$.

Iteration 3

$$f(\lambda_3) = -0.309881, \quad f'(\lambda_3) = -0.0179078, \quad f''(\lambda_3) = 1.17126$$

$$\lambda_4 = \lambda_3 - \frac{f'(\lambda_3)}{f''(\lambda_3)} = 0.480409$$

Convergence check: $|f'(\lambda_4)| = |-0.0005033| < \varepsilon$.

Since the process has converged, the optimum solution is taken as $\lambda^* \approx \lambda_4 =
0.480409$.

5.12.2 Quasi-Newton Method 

If the function being minimized $f(\lambda)$ is not available in closed form or is difficult to
differentiate, the derivatives $f'(\lambda)$ and $f''(\lambda)$ in Eq. (5.65) can be approximated by the
finite difference formulas as

$$f'(\lambda_i) = \frac{f(\lambda_i + \Delta\lambda) - f(\lambda_i - \Delta\lambda)}{2\Delta\lambda} \tag{5.67}$$

$$f''(\lambda_i) = \frac{f(\lambda_i + \Delta\lambda) - 2f(\lambda_i) + f(\lambda_i - \Delta\lambda)}{\Delta\lambda^2} \tag{5.68}$$

where $\Delta\lambda$ is a small step size. Substitution of Eqs. (5.67) and (5.68) into Eq. (5.65)
leads to

$$\lambda_{i+1} = \lambda_i - \frac{\Delta\lambda\,[f(\lambda_i + \Delta\lambda) - f(\lambda_i - \Delta\lambda)]}{2[f(\lambda_i + \Delta\lambda) - 2f(\lambda_i) + f(\lambda_i - \Delta\lambda)]} \tag{5.69}$$

The iterative process indicated by Eq. (5.69) is known as the quasi-Newton method.
To test the convergence of the iterative process, the following criterion can be used:

$$|f'(\lambda_{i+1})| = \left|\frac{f(\lambda_{i+1} + \Delta\lambda) - f(\lambda_{i+1} - \Delta\lambda)}{2\Delta\lambda}\right| \le \varepsilon \tag{5.70}$$

where a central difference formula has been used for evaluating the derivative of $f$
and $\varepsilon$ is a small quantity.

Remarks:

1. The central difference formulas have been used in Eqs. (5.69) and (5.70). However,
the forward or backward difference formulas can also be used for this
purpose.

2. Equation (5.69) requires the evaluation of the function at the points $\lambda_i + \Delta\lambda$
and $\lambda_i - \Delta\lambda$ in addition to $\lambda_i$ in each iteration.
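
Equations (5.69) and (5.70) translate into a short derivative-free loop. The sketch below is illustrative: the name `quasi_newton_minimize` and the guard against a vanishing second difference are assumptions, and `math.atan2(1.0, lam)` stands in for $\tan^{-1}(1/\lambda)$ in the test function.

```python
import math


def f(lam):
    """Test function 0.65 - 0.75/(1 + lam^2) - 0.65*lam*atan(1/lam)."""
    return 0.65 - 0.75 / (1.0 + lam**2) - 0.65 * lam * math.atan2(1.0, lam)


def quasi_newton_minimize(f, lam, dlam=0.01, eps=0.01, max_iter=50):
    """Iterate Eq. (5.69) with central differences and stop with Eq. (5.70).
    Returns the final point and the number of iterations used."""
    for i in range(1, max_iter + 1):
        f_plus, f_0, f_minus = f(lam + dlam), f(lam), f(lam - dlam)
        second_diff = f_plus - 2.0 * f_0 + f_minus
        if second_diff == 0.0:
            raise ZeroDivisionError("second difference vanished")
        lam = lam - dlam * (f_plus - f_minus) / (2.0 * second_diff)
        slope = (f(lam + dlam) - f(lam - dlam)) / (2.0 * dlam)
        if abs(slope) <= eps:
            return lam, i
    raise RuntimeError("quasi-Newton iteration did not converge")
```

Because the second difference subtracts nearly equal numbers, the iterates are sensitive to the arithmetic precision used, so intermediate values from a double-precision run may differ in the later digits from hand or single-precision calculations. With $\lambda_1 = 0.1$, $\Delta\lambda = 0.01$, and $\varepsilon = 0.01$, the sketch stops close to $\lambda = 0.48$.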


Example 5.13 Find the minimum of the function

$$f(\lambda) = 0.65 - \frac{0.75}{1 + \lambda^2} - 0.65\lambda \tan^{-1}\frac{1}{\lambda}$$

using the quasi-Newton method with the starting point $\lambda_1 = 0.1$ and the step size $\Delta\lambda =
0.01$ in central difference formulas. Use $\varepsilon = 0.01$ in Eq. (5.70) for checking the
convergence.

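SOLUTION

Iteration 1

$$\lambda_1 = 0.1, \quad \Delta\lambda = 0.01, \quad \varepsilon = 0.01, \quad f_1 = f(\lambda_1) = -0.188197,$$

$$f_1^+ = f(\lambda_1 + \Delta\lambda) = -0.195512, \quad f_1^- = f(\lambda_1 - \Delta\lambda) = -0.180615$$

$$\lambda_2 = \lambda_1 - \frac{\Delta\lambda\,(f_1^+ - f_1^-)}{2(f_1^+ - 2f_1 + f_1^-)} = 0.377882$$

Convergence check:

$$|f'(\lambda_2)| = \left|\frac{f_2^+ - f_2^-}{2\Delta\lambda}\right| = 0.137300 > \varepsilon$$

Iteration 2

$$f_2 = f(\lambda_2) = -0.303368, \quad f_2^+ = f(\lambda_2 + \Delta\lambda) = -0.304662,$$

$$f_2^- = f(\lambda_2 - \Delta\lambda) = -0.301916$$

$$\lambda_3 = \lambda_2 - \frac{\Delta\lambda\,(f_2^+ - f_2^-)}{2(f_2^+ - 2f_2 + f_2^-)} = 0.465390$$

Convergence check:

$$|f'(\lambda_3)| = \left|\frac{f_3^+ - f_3^-}{2\Delta\lambda}\right| = 0.017700 > \varepsilon$$

Iteration 3

$$f_3 = f(\lambda_3) = -0.309885, \quad f_3^+ = f(\lambda_3 + \Delta\lambda) = -0.310004,$$

$$f_3^- = f(\lambda_3 - \Delta\lambda) = -0.309650$$

$$\lambda_4 = \lambda_3 - \frac{\Delta\lambda\,(f_3^+ - f_3^-)}{2(f_3^+ - 2f_3 + f_3^-)} = 0.480600$$

Convergence check:

$$|f'(\lambda_4)| = \left|\frac{f_4^+ - f_4^-}{2\Delta\lambda}\right| = 0.000350 < \varepsilon$$

Since the process has converged, we take the optimum solution as $\lambda^* \approx \lambda_4 = 0.480600$.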


5.12.3 Secant Method 

The secant method uses an equation similar to Eq. (5.64) as

$$f'(\lambda) = f'(\lambda_i) + s(\lambda - \lambda_i) = 0 \tag{5.71}$$

where $s$ is the slope of the line connecting the two points $(A, f'(A))$ and $(B, f'(B))$,
where $A$ and $B$ denote two different approximations to the correct solution, $\lambda^*$. The
slope $s$ can be expressed as (Fig. 5.19)

$$s = \frac{f'(B) - f'(A)}{B - A} \tag{5.72}$$

Equation (5.71) approximates the function $f'(\lambda)$ between $A$ and $B$ as a linear equation
(secant), and hence the solution of Eq. (5.71) gives the new approximation to the root
of $f'(\lambda)$ as

$$\lambda_{i+1} = \lambda_i - \frac{f'(\lambda_i)}{s} = A - \frac{f'(A)(B - A)}{f'(B) - f'(A)} \tag{5.73}$$

The iterative process given by Eq. (5.73) is known as the secant method (Fig. 5.19).
Since the secant approaches the second derivative of $f(\lambda)$ at $A$ as $B$ approaches $A$,
the secant method can also be considered as a quasi-Newton method. It can also be
considered as a form of elimination technique since part of the interval, $(A, \lambda_{i+1})$ in
Fig. 5.19, is eliminated in every iteration. The iterative process can be implemented by
using the following step-by-step procedure.

Figure 5.19 Iterative process of the secant method.


1. Set $\lambda_1 = A = 0$ and evaluate $f'(A)$. The value of $f'(A)$ will be negative.
Assume an initial trial step length $t_0$. Set $i = 1$.

2. Evaluate $f'(t_0)$.

3. If $f'(t_0) < 0$, set $A = \lambda_i = t_0$, $f'(A) = f'(t_0)$, new $t_0 = 2t_0$, and go to step 2.

4. If $f'(t_0) \ge 0$, set $B = t_0$, $f'(B) = f'(t_0)$, and go to step 5.

5. Find the new approximate solution of the problem as

$$\lambda_{i+1} = A - \frac{f'(A)(B - A)}{f'(B) - f'(A)} \tag{5.74}$$

6. Test for convergence:

$$|f'(\lambda_{i+1})| \le \varepsilon \tag{5.75}$$

where $\varepsilon$ is a small quantity. If Eq. (5.75) is satisfied, take $\lambda^* \approx \lambda_{i+1}$ and stop
the procedure. Otherwise, go to step 7.

7. If $f'(\lambda_{i+1}) \ge 0$, set new $B = \lambda_{i+1}$, $f'(B) = f'(\lambda_{i+1})$, $i = i + 1$, and go to
step 5.

8. If $f'(\lambda_{i+1}) < 0$, set new $A = \lambda_{i+1}$, $f'(A) = f'(\lambda_{i+1})$, $i = i + 1$, and go to
step 5.
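
The eight steps can be sketched as follows. The name `secant_minimize`, the iteration caps, and the returned list of iterates are assumptions made for illustration; the slope function `df` is that of $f(\lambda) = 0.65 - 0.75/(1 + \lambda^2) - 0.65\lambda\tan^{-1}(1/\lambda)$, with `math.atan2(1.0, lam)` replacing $\tan^{-1}(1/\lambda)$ so that $f'(0)$ can be evaluated.

```python
import math


def df(lam):
    """Slope of the test function."""
    return (1.5 * lam / (1.0 + lam**2) ** 2
            + 0.65 * lam / (1.0 + lam**2)
            - 0.65 * math.atan2(1.0, lam))


def secant_minimize(df, t0, eps=0.01, max_iter=100):
    """Secant method on f'(lam) = 0, following steps 1-8.
    Returns the final point and the list of iterates lam_2, lam_3, ..."""
    A, dA = 0.0, df(0.0)                      # step 1
    if dA >= 0.0:
        raise ValueError("f'(0) must be negative")
    for _ in range(60):                       # steps 2-4: find B with f'(B) >= 0
        d0 = df(t0)
        if d0 >= 0.0:
            B, dB = t0, d0
            break
        A, dA = t0, d0
        t0 *= 2.0
    else:
        raise RuntimeError("no point with nonnegative slope found")
    iterates = []
    for _ in range(max_iter):
        lam = A - dA * (B - A) / (dB - dA)    # step 5, Eq. (5.74)
        d_lam = df(lam)
        iterates.append(lam)
        if abs(d_lam) <= eps:                 # step 6, Eq. (5.75)
            return lam, iterates
        if d_lam >= 0.0:                      # step 7
            B, dB = lam, d_lam
        else:                                 # step 8
            A, dA = lam, d_lam
    raise RuntimeError("secant iteration did not converge")
```

With $t_0 = 0.1$ and $\varepsilon = 0.01$, the bracketing phase ends with $A = 0.4$ and $B = 0.8$, and the first secant step gives $\lambda_2 \approx 0.5458$.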
Figure 5.20 Situation when $f'_A$ varies very slowly.

Remarks:

1. The secant method is identical to assuming a linear equation for $f'(\lambda)$. This
implies that the original function, $f(\lambda)$, is approximated by a quadratic equation.

2. In some cases we may encounter a situation where the function $f'(\lambda)$ varies
very slowly with $\lambda$, as shown in Fig. 5.20. This situation can be identified
by noticing that the point $B$ remains unaltered for several consecutive refits.
Once such a situation is suspected, the convergence process can be improved
by taking the next value of $\lambda_{i+1}$ as $(A + B)/2$ instead of finding its value from
Eq. (5.74).

Example 5.14 Find the minimum of the function

$$f(\lambda) = 0.65 - \frac{0.75}{1 + \lambda^2} - 0.65\lambda \tan^{-1}\frac{1}{\lambda}$$

using the secant method with an initial step size of $t_0 = 0.1$, $\lambda_1 = 0.0$, and $\varepsilon = 0.01$.

SOLUTION $\lambda_1 = A = 0.0$, $t_0 = 0.1$, $f'(A) = -1.02102$, $B = A + t_0 = 0.1$,
$f'(B) = -0.744832$. Since $f'(B) < 0$, we set new $A = 0.1$, $f'(A) = -0.744832$, $t_0 =
2(0.1) = 0.2$, $B = \lambda_1 + t_0 = 0.2$, and compute $f'(B) = -0.490343$. Since $f'(B) < 0$,
we set new $A = 0.2$, $f'(A) = -0.490343$, $t_0 = 2(0.2) = 0.4$, $B = \lambda_1 + t_0 = 0.4$,
and compute $f'(B) = -0.103652$. Since $f'(B) < 0$, we set new $A = 0.4$, $f'(A) =
-0.103652$, $t_0 = 2(0.4) = 0.8$, $B = \lambda_1 + t_0 = 0.8$, and compute $f'(B) = +0.180800$.
Since $f'(B) > 0$, we proceed to find $\lambda_2$.


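Iteration 1

Since $A = \lambda_1 = 0.4$, $f'(A) = -0.103652$, $B = 0.8$, $f'(B) = +0.180800$, we compute

$$\lambda_2 = A - \frac{f'(A)(B - A)}{f'(B) - f'(A)} = 0.545757$$

Convergence check: $|f'(\lambda_2)| = |+0.0630| > \varepsilon$.

Iteration 2

Since $f'(\lambda_2) = +0.0630 > 0$, we set new $A = 0.4$, $f'(A) = -0.103652$, $B = \lambda_2 =
0.545757$, $f'(B) = f'(\lambda_2) = +0.0630$, and compute

$$\lambda_3 = A - \frac{f'(A)(B - A)}{f'(B) - f'(A)} = 0.490632$$

Convergence check: $|f'(\lambda_3)| = |+0.0105789| > \varepsilon$.

Iteration 3

Since $f'(\lambda_3) = +0.0105789 > 0$, we set new $A = 0.4$, $f'(A) = -0.103652$, $B = \lambda_3 =
0.490632$, $f'(B) = f'(\lambda_3) = +0.0105789$, and compute

$$\lambda_4 = A - \frac{f'(A)(B - A)}{f'(B) - f'(A)} = 0.48224$$

Convergence check: $|f'(\lambda_4)| = |+0.00151235| < \varepsilon$.

Since the process has converged, the optimum solution is given by $\lambda^* \approx \lambda_4 =
0.48224$.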


5.13 PRACTICAL CONSIDERATIONS 

5.13.1 How to Make the Methods Efficient and More Reliable 

In some cases, some of the interpolation methods discussed in Sections 5.10 to 5.12
may be very slow to converge, may diverge, or may predict the minimum of the function,
$f(\lambda)$, outside the initial interval of uncertainty, especially when the interpolating
polynomial is not representative of the variation of the function being minimized. In
such cases we can use the Fibonacci or golden section method to find the minimum. In
some problems it might prove to be more efficient to combine several techniques. For
example, the unrestricted search with an accelerated step size can be used to bracket
the minimum and then the Fibonacci or the golden section method can be used to find
the optimum point. In some cases the Fibonacci or golden section method can be used
in conjunction with an interpolation method.
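
One such combination, bracketing by an accelerated (doubling) step followed by golden section search, might be sketched as below. The function names, the tolerance, and the assumptions that the search starts at $\lambda = 0$ and that $f$ is unimodal on the bracket are illustrative choices, not prescriptions from the text.

```python
import math


def bracket_minimum(f, t0, max_doublings=60):
    """Unrestricted search from lam = 0 with step doubling.
    Returns (lo, hi) expected to contain the minimum of a unimodal f."""
    lo, mid = 0.0, t0
    f_mid = f(mid)
    if f_mid > f(lo):
        return lo, mid
    for _ in range(max_doublings):
        hi = 2.0 * mid
        f_hi = f(hi)
        if f_hi > f_mid:
            return lo, hi
        lo, mid, f_mid = mid, hi, f_hi
    raise RuntimeError("no bracket found; f keeps decreasing")


def golden_section(f, lo, hi, tol=1e-6):
    """Golden section search on [lo, hi]; returns the interval midpoint."""
    r = (math.sqrt(5.0) - 1.0) / 2.0          # about 0.618
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                            # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = f(x1)
        else:                                  # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)
```

For $f = \lambda^5 - 5\lambda^3 - 20\lambda + 5$ with $t_0 = 0.5$, the bracket is $[1, 4]$, and `golden_section(f, 1.0, 4.0)` narrows it around $\lambda = 2$.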


5.13.2 Implementation in Multivariable Optimization Problems 

As stated earlier, the one-dimensional minimization methods are useful in multivariable
optimization problems to find an improved design vector $\mathbf{X}_{i+1}$ from the current design
vector $\mathbf{X}_i$ using the formula

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i \tag{5.76}$$

where $\mathbf{S}_i$ is the known search direction and $\lambda_i^*$ is the optimal step length found by
solving the one-dimensional minimization problem as

$$\lambda_i^* = \min_{\lambda_i}\,[f(\mathbf{X}_i + \lambda_i \mathbf{S}_i)] \tag{5.77}$$

Here the objective function $f$ is to be evaluated at any trial step length $t_0$ as

$$f(t_0) = f(\mathbf{X}_i + t_0 \mathbf{S}_i) \tag{5.78}$$

Similarly, the derivative of the function $f$ with respect to $\lambda$ corresponding to the trial
step length $t_0$ is to be found as

$$\left.\frac{df}{d\lambda}\right|_{\lambda = t_0} = \left.\mathbf{S}_i^T \nabla f\right|_{\lambda = t_0} \tag{5.79}$$

Separate function programs or subroutines can be written conveniently to implement
Eqs. (5.78) and (5.79).
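
A minimal sketch of such helper routines is shown below. The name `make_line_functions` and the use of plain Python lists for $\mathbf{X}_i$ and $\mathbf{S}_i$ are assumptions made for illustration; `grad` is a user-supplied function returning $\nabla f$.

```python
def make_line_functions(f, grad, X, S):
    """Return (phi, dphi) with phi(t) = f(X + t*S), Eq. (5.78), and
    dphi(t) = S^T grad f(X + t*S), Eq. (5.79)."""
    def point(t):
        return [x + t * s for x, s in zip(X, S)]

    def phi(t):
        return f(point(t))

    def dphi(t):
        g = grad(point(t))
        return sum(s * gi for s, gi in zip(S, g))

    return phi, dphi


# Example: f(X) = x1^2 + 2*x2^2 searched from X = (1, 1) along S = (-1, -1).
phi, dphi = make_line_functions(
    lambda X: X[0] ** 2 + 2.0 * X[1] ** 2,
    lambda X: [2.0 * X[0], 4.0 * X[1]],
    [1.0, 1.0],
    [-1.0, -1.0],
)
```

Here `phi(t)` equals $3(1 - t)^2$, so `dphi(0.0)` is negative (a descent direction) and the one-dimensional minimum is at $t = 1$. Any of the one-dimensional methods of this chapter can then be applied to `phi` and `dphi`.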
5.13.3 Comparison of Methods

It has been shown in Section 5.9 that the Fibonacci method is the most efficient elimination
technique in finding the minimum of a function if the initial interval of uncertainty
is known. In the absence of the initial interval of uncertainty, the quadratic interpolation
method or the quasi-Newton method is expected to be more efficient when the
derivatives of the function are not available. When the first derivatives of the function
being minimized are available, the cubic interpolation method or the secant method is
expected to be very efficient. On the other hand, if both the first and second derivatives
of the function are available, the Newton method will be the most efficient one in
finding the optimal step length, $\lambda^*$.

In general, the efficiency and reliability of the various methods are problem depen- 
dent and any efficient computer program must include many heuristic additions not 
indicated explicitly by the method. The heuristic considerations are needed to handle 
multimodal functions (functions with multiple extreme points), sharp variations in the 
slopes (first derivatives) and curvatures (second derivatives) of the function, and the 
effects of round-off errors resulting from the precision used in the arithmetic opera- 
tions. A comparative study of the efficiencies of the various search methods is given in 
Ref. [5.10]. 


5.14 MATLAB SOLUTION OF ONE-DIMENSIONAL 
MINIMIZATION PROBLEMS 

The solution of one-dimensional minimization problems, using the MATLAB function
fminbnd (with options set through optimset), is illustrated by the following example.

Example 5.15 Find the minimum of the following function:

$$f(x) = 0.65 - \frac{0.75}{1 + x^2} - 0.65x \tan^{-1}\left(\frac{1}{x}\right)$$
SOLUTION 

Step 1: Write an M-file objfun.m for the objective function.

    function f = objfun(x)
    f = 0.65 - (0.75/(1 + x^2)) - 0.65*x*atan(1/x);

Step 2: Invoke unconstrained optimization program (write this in new MATLAB
file).

    clc
    clear all
    warning off
    options = optimset('LargeScale', 'off');
    [x, fval] = fminbnd(@objfun, 0, 0.5, options)

This produces the solution or output as follows:

    x =
        0.4809
    fval =
       -0.3100
REVIEW QUESTIONS

5.1 What is a one-dimensional minimization problem? 

5.2 What are the limitations of classical methods in solving a one-dimensional minimization 
problem? 

5.3 What is the difference between elimination and interpolation methods? 

5.4 Define Fibonacci numbers. 

5.5 What is the difference between Fibonacci and golden section methods? 

5.6 What is a unimodal function? 

5.7 What is an interval of uncertainty? 

5.8 Suggest a method of finding the minimum of a multimodal function. 

5.9 What is an exhaustive search method? 
5.10 What is a dichotomous search method?

5.11 Define the golden mean. 

5.12 What is the difference between quadratic and cubic interpolation methods? 

5.13 Why is refitting necessary in interpolation methods? 

5.14 What is a direct root method? 

5.15 What is the basis of the interval halving method? 

5.16 What is the difference between Newton and quasi-Newton methods? 

5.17 What is the secant method? 

5.18 Answer true or false: 

(a) A unimodal function cannot be discontinuous. 

(b) All elimination methods assume the function to be unimodal. 

(c) The golden section method is more accurate than the Fibonacci method. 

(d) Nearly 50% of the interval of uncertainty is eliminated with each pair of experiments 
in the dichotomous search method. 

(e) The number of experiments to be conducted is to be specified beforehand in both the 
Fibonacci and golden section methods. 


PROBLEMS 

5.1 Find the minimum of the function

$$f(x) = 0.65 - \frac{0.75}{1 + x^2} - 0.65x \tan^{-1}\frac{1}{x}$$

using the following methods: 

(a) Unrestricted search with a fixed step size of 0.1 from the starting point 0.0

(b) Unrestricted search with an accelerated step size using an initial step size of 0.1 and 
starting point of 0.0 

(c) Exhaustive search method in the interval (0, 3) to achieve an accuracy of within 5% 
of the exact value 

(d) Dichotomous search method in the interval (0, 3) to achieve an accuracy of within
5% of the exact value using a value of $\delta = 0.0001$

(e) Interval halving method in the interval (0, 3) to achieve an accuracy of within 5% of 
the exact value 

5.2 Find the minimum of the function given in Problem 5.1 using the quadratic interpolation 
method with an initial step size of 0.1. 

5.3 Find the minimum of the function given in Problem 5.1 using the cubic interpolation
method with an initial step size of $t_0 = 0.1$.

5.4 Plot the graph of the function f(x) given in Problem 5.1 in the range (0, 3) and identify
its minimum.

5.5 The shear stress induced along the z-axis when two cylinders are in contact with each
other is given by

$$\frac{\tau_{zy}}{p_{\max}} = -\frac{1}{2}\left[-\frac{1}{\sqrt{1+\left(\frac{z}{b}\right)^2}} + \left\{2 - \frac{1}{1+\left(\frac{z}{b}\right)^2}\right\}\sqrt{1+\left(\frac{z}{b}\right)^2} - 2\left(\frac{z}{b}\right)\right] \tag{1}$$

where 2b is the width of the contact area and $p_{\max}$ is the maximum pressure developed
at the center of the contact area (Fig. 5.21):

$$b = \left[\frac{2F}{\pi l}\;\frac{\dfrac{1-\nu_1^2}{E_1} + \dfrac{1-\nu_2^2}{E_2}}{\dfrac{1}{d_1} + \dfrac{1}{d_2}}\right]^{1/2} \tag{2}$$

$$p_{\max} = \frac{2F}{\pi b l} \tag{3}$$

F is the contact force; $E_1$ and $E_2$ are Young's moduli of the two cylinders; $\nu_1$ and $\nu_2$ are
Poisson's ratios of the two cylinders; $d_1$ and $d_2$ the diameters of the two cylinders, and l
the axial length of contact (length of the shorter cylinder). In many practical applications,
such as roller bearings, when the contact load (F) is large, a crack originates at the point
of maximum shear stress and propagates to the surface leading to a fatigue failure. To
locate the origin of a crack, it is necessary to find the point at which the shear stress
attains its maximum value. Show that the problem of finding the location of the maximum
shear stress for $\nu_1 = \nu_2 = 0.3$ reduces to maximizing the function

$$f(\lambda) = \frac{0.5}{\sqrt{1+\lambda^2}} - \sqrt{1+\lambda^2}\left(1 - \frac{0.5}{1+\lambda^2}\right) + \lambda \tag{4}$$

where $f = \tau_{zy}/p_{\max}$ and $\lambda = z/b$.

Figure 5.21 Contact stress between two cylinders.

5.6 Plot the graph of the function $f(\lambda)$ given by Eq. (4) in Problem 5.5 in the range (0, 3)
and identify its maximum.

5.7 Find the maximum of the function given by Eq. (4) in Problem 5.5 using the following
methods:

(a) Unrestricted search with a fixed step size of 0.1 from the starting point 0.0

(b) Unrestricted search with an accelerated step size using an initial step length of 0.1
and a starting point of 0.0

(c) Exhaustive search method in the interval (0, 3) to achieve an accuracy of within 5%
of the exact value

(d) Dichotomous search method in the interval (0, 3) to achieve an accuracy of within
5% of the exact value using a value of $\delta = 0.0001$

(e) Interval halving method in the interval (0, 3) to achieve an accuracy of within 5%
of the exact value

5.8 Find the maximum of the function given by Eq. (4) in Problem 5.5 using the following
methods:

(a) Fibonacci method with n = 8

(b) Golden section method with n = 8

5.9 Find the maximum of the function given by Eq. (4) in Problem 5.5 using the quadratic
interpolation method with an initial step length of 0.1.

5.10 Find the maximum of the function given by Eq. (4) in Problem 5.5 using the cubic
interpolation method with an initial step length of $t_0 = 0.1$.

5.11 Find the maximum of the function $f(\lambda)$ given by Eq. (4) in Problem 5.5 using the
following methods:

(a) Newton method with the starting point 0.6

(b) Quasi-Newton method with the starting point 0.6 and a finite difference step size of

(c) Secant method with the starting point $\lambda_1 = 0.0$ and $t_0 = 0.1$

5.12 Prove that a convex function is unimodal.

5.13 Compare the ratios of intervals of uncertainty ($L_n/L_0$) obtainable in the following
methods for n = 2, 3, ..., 10:

(a) Exhaustive search

(b) Dichotomous search with $\delta = 10^{-4}$

(c) Interval halving method

(d) Fibonacci method

(e) Golden section method

5.14 Find the number of experiments to be conducted in the following methods to obtain a
value of $L_n/L_0 = 0.001$:

(a) Exhaustive search

(b) Dichotomous search with $\delta = 10^{-4}$

(c) Interval halving method 

(d) Fibonacci method 

(e) Golden section method 

5.15 Find the value of x in the interval (0, 1) which minimizes the function $f = x(x - 1.5)$
to within ±0.05 by (a) the golden section method and (b) the Fibonacci method.

5.16 Find the minimum of the function $f = \lambda^5 - 5\lambda^3 - 20\lambda + 5$ by the following methods:

(a) Unrestricted search with a fixed step size of 0.1 starting from $\lambda = 0.0$

(b) Unrestricted search with accelerated step size from the initial point 0.0 with a starting
step length of 0.1

(c) Exhaustive search in the interval (0, 5)

(d) Dichotomous search in the interval (0, 5) with $\delta = 0.0001$

(e) Interval halving method in the interval (0, 5)

(f) Fibonacci search in the interval (0, 5)

(g) Golden section method in the interval (0, 5)

5.17 Find the minimum of the function $f = \lambda/\log\lambda$ by the following methods (take the
initial trial step length as 0.1):

(a) Quadratic interpolation method

(b) Cubic interpolation method

5.18 Find the minimum of the function $f = \lambda/\log\lambda$ using the following methods:

(a) Newton method 

(b) Quasi-Newton method 

(c) Secant method 

5.19 Consider the function

$$f = \frac{2x_1^2 + 2x_2^2 + 3x_3^2 - 2x_1x_2 - 2x_2x_3}{x_1^2 + x_2^2 + 2x_3^2}$$

Substitute $\mathbf{X} = \mathbf{X}_1 + \lambda\mathbf{S}$ into this function and derive an exact formula for the minimizing
step length $\lambda^*$.

5.20 Minimize the function $f = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$ starting from the point $\mathbf{X}_1 = \{0, 0\}^T$
along the direction $\mathbf{S} = \{-1, 0\}^T$ using the quadratic interpolation method with an initial step
length of 0.1.

5.21 Consider the problem

Minimize $f(\mathbf{X}) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$

and the starting point, $\mathbf{X}_1 = \{-1, 1\}^T$. Find the minimum of $f(\mathbf{X})$ along the direction
$\mathbf{S}_1 = \{4, 0\}^T$ using the quadratic interpolation method. Use a maximum of two refits.

5.22 Solve Problem 5.21 using the cubic interpolation method. Use a maximum of two refits. 

5.23 Solve Problem 5.21 using the direct root method. Use a maximum of two refits. 

5.24 Solve Problem 5.21 using the Newton method. Use a maximum of two refits. 

5.25 Solve Problem 5.21 using the Fibonacci method with $L_0 = (0, 0.1)$.

5.26 Write a computer program, in the form of a subroutine, to implement the Fibonacci 
method. 

5.27 Write a computer program, in the form of a subroutine, to implement the golden section 
method. 

5.28 Write a computer program, in the form of a subroutine, to implement the quadratic 
interpolation method. 

5.29 Write a computer program, in the form of a subroutine, to implement the cubic interpo- 
lation method. 

5.30 Write a computer program, in the form of a subroutine, to implement the secant method. 

5.31 Find the maximum of the function given by Eq. (4) in Problem 5.5 using MATLAB.
Assume the bounds on $\lambda$ as 0 and 3.

5.32 Find the minimum of the function $f(\lambda)$ given in Problem 5.16, in the range 0 and 5, using
MATLAB.

5.33 Find the minimum of $f(x) = x(x - 1.5)$ in the interval (0, 1) using MATLAB.

5.34 Find the minimum of the function f(x) = — 22^ in the range (0, 10) using MATLAB. 

5.35 Find the minimum of the function $f(x) = x^3 + x^2 - x - 2$ in the interval -4 and 4 using
MATLAB.

5.36 Find the minimum of the function f(x) = — — + 6(1 ° '' in the interval —4 and 4 using 
MATLAB.

Nonlinear Programming II: Unconstrained Optimization Techniques


6.1 INTRODUCTION 

This chapter deals with the various methods of solving the unconstrained minimization 
problem: 


$$\text{Find } \mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix} \text{ which minimizes } f(\mathbf{X}) \tag{6.1}$$


Although a practical design problem is rarely unconstrained, a study
of this class of problems is important for the following reasons:

1. The constraints do not have significant influence in certain design problems. 

2. Some of the powerful and robust methods of solving constrained minimization 
problems require the use of unconstrained minimization techniques. 

3. The study of unconstrained minimization techniques provides the basic
understanding necessary for the study of constrained minimization methods.

4. The unconstrained minimization methods can be used to solve certain complex 
engineering analysis problems. For example, the displacement response (linear 
or nonlinear) of any structure under any specified load condition can be found 
by minimizing its potential energy. Similarly, the eigenvalues and eigenvectors 
of any discrete system can be found by minimizing the Rayleigh quotient. 

As discussed in Chapter 2, a point $\mathbf{X}^*$ will be a relative minimum of $f(\mathbf{X})$ if the
necessary conditions

$$\frac{\partial f}{\partial x_i}(\mathbf{X} = \mathbf{X}^*) = 0, \qquad i = 1, 2, \ldots, n \tag{6.2}$$

are satisfied. The point $\mathbf{X}^*$ is guaranteed to be a relative minimum if the Hessian matrix
is positive definite, that is,

$$\mathbf{J}_{\mathbf{X}^*} = [J]_{\mathbf{X}^*} = \left[\frac{\partial^2 f}{\partial x_i\,\partial x_j}(\mathbf{X}^*)\right] = \text{positive definite} \tag{6.3}$$

Engineering Optimization: Theory and Practice, Fourth Edition, Singiresu S. Rao


Equations (6.2) and (6.3) can be used to identify the optimum point during numerical
computations. However, if the function is not differentiable, Eqs. (6.2) and (6.3) cannot
be applied to identify the optimum point. For example, consider the function

$$f(x) = \begin{cases} ax & \text{for } x \ge 0 \\ -bx & \text{for } x \le 0 \end{cases}$$

where a > 0 and b > 0. The graph of this function is shown in Fig. 6.1. It can be
seen that this function is not differentiable at the minimum point, $x^* = 0$, and hence
Eqs. (6.2) and (6.3) are not applicable in identifying $x^*$. In all such cases, only the commonly
understood notion of a minimum, namely, $f(\mathbf{X}^*) \le f(\mathbf{X})$ for all $\mathbf{X}$, can be used
to identify a minimum point. The following example illustrates the formulation of a
typical analysis problem as an unconstrained minimization problem.


Example 6.1 A cantilever beam is subjected to an end force $P_0$ and an end moment
$M_0$ as shown in Fig. 6.2a. By using a one-finite-element model indicated in Fig. 6.2b,
the transverse displacement, w(x), can be expressed as [6.1]

$$w(x) = \{N_1(x)\;\; N_2(x)\;\; N_3(x)\;\; N_4(x)\}\begin{Bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{Bmatrix} \tag{E1}$$

where $N_i(x)$ are called shape functions and are given by

$$N_1(x) = 2\alpha^3 - 3\alpha^2 + 1 \tag{E2}$$

$$N_2(x) = (\alpha^3 - 2\alpha^2 + \alpha)\,l \tag{E3}$$

$$N_3(x) = -2\alpha^3 + 3\alpha^2 \tag{E4}$$

$$N_4(x) = (\alpha^3 - \alpha^2)\,l \tag{E5}$$

$\alpha = x/l$, and $u_1$, $u_2$, $u_3$, and $u_4$ are the end displacements (or slopes) of the beam.
The deflection of the beam at point A can be found by minimizing the potential energy
of the beam (F), which can be expressed as [6.1]

$$F = \frac{1}{2}\int_0^l EI\left(\frac{d^2 w}{dx^2}\right)^2 dx - P_0u_3 - M_0u_4 \tag{E6}$$

where E is Young's modulus and I is the area moment of inertia of the beam. Formulate
the optimization problem in terms of the variables $x_1 = u_3$ and $x_2 = u_4 l$ for the case
$P_0l^3/EI = 1$ and $M_0l^2/EI = 2$.

Figure 6.1 Function is not differentiable at minimum point.

Figure 6.2 Finite-element model of a cantilever beam.


SOLUTION Since the boundary conditions are given by $u_1 = u_2 = 0$, w(x) can be
expressed as

$$w(x) = (-2\alpha^3 + 3\alpha^2)\,u_3 + (\alpha^3 - \alpha^2)\,l\,u_4 \tag{E7}$$

so that

$$\frac{d^2 w}{dx^2} = \frac{6u_3}{l^2}(-2\alpha + 1) + \frac{2u_4}{l}(3\alpha - 1) \tag{E8}$$

Equation (E6) can be rewritten as

$$\begin{aligned}
F &= \frac{1}{2}\int_0^1 EI\left(\frac{d^2 w}{dx^2}\right)^2 l\,d\alpha - P_0u_3 - M_0u_4 \\
  &= \frac{EI\,l}{2}\int_0^1 \left[\frac{6u_3}{l^2}(-2\alpha + 1) + \frac{2u_4}{l}(3\alpha - 1)\right]^2 d\alpha - P_0u_3 - M_0u_4 \\
  &= \frac{EI}{l^3}\left(6u_3^2 + 2u_4^2l^2 - 6u_3u_4l\right) - P_0u_3 - M_0u_4
\end{aligned} \tag{E9}$$

By using the relations $u_3 = x_1$, $u_4l = x_2$, $P_0l^3/EI = 1$, and $M_0l^2/EI = 2$, and
introducing the notation $f = Fl^3/EI$, Eq. (E9) can be expressed as

$$f = 6x_1^2 - 6x_1x_2 + 2x_2^2 - x_1 - 2x_2 \tag{E10}$$

Thus the optimization problem is to determine $x_1$ and $x_2$, which minimize the function
f given by Eq. (E10).
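Because the function of Eq. (E10) is a quadratic with a positive definite Hessian, its minimum can be checked by solving the two linear stationarity conditions directly. The following is a minimal sketch of that check; the function names `potential_energy` and `stationary_point` are chosen here for illustration and do not appear in the text.

```python
def potential_energy(x1, x2):
    """Scaled potential energy f of Eq. (E10)."""
    return 6 * x1**2 - 6 * x1 * x2 + 2 * x2**2 - x1 - 2 * x2


def stationary_point():
    """Solve grad f = 0, i.e. 12*x1 - 6*x2 = 1 and -6*x1 + 4*x2 = 2."""
    a11, a12, a21, a22 = 12.0, -6.0, -6.0, 4.0   # Hessian entries of f
    b1, b2 = 1.0, 2.0                            # right-hand side of the linear system
    det = a11 * a22 - a12 * a21                  # det = 12 > 0 and a11 > 0: positive definite
    x1 = (b1 * a22 - a12 * b2) / det             # Cramer's rule
    x2 = (a11 * b2 - a21 * b1) / det
    return x1, x2


x1_opt, x2_opt = stationary_point()
print(x1_opt, x2_opt, potential_energy(x1_opt, x2_opt))
```

The computed values, $x_1 = 4/3$ and $x_2 = 5/2$, agree with the elementary beam-theory results for a cantilever, namely a tip deflection of $P_0l^3/(3EI) + M_0l^2/(2EI)$ and a tip slope of $P_0l^2/(2EI) + M_0l/(EI)$, evaluated for the load case of this example.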


6.1.1 Classification of Unconstrained Minimization Methods 

Several methods are available for solving an unconstrained minimization problem.
These methods can be classified into two broad categories as direct search methods
and descent methods as indicated in Table 6.1. The direct search methods require only
the objective function values but not the partial derivatives of the function in finding
the minimum and hence are often called the nongradient methods. The direct search
methods are also known as zeroth-order methods since they use zeroth-order derivatives
of the function. These methods are most suitable for simple problems involving a
relatively small number of variables. These methods are, in general, less efficient than
the descent methods. The descent techniques require, in addition to the function values,
the first and in some cases the second derivatives of the objective function. Since
more information about the function being minimized is used (through the use of
derivatives), descent methods are generally more efficient than direct search techniques.
The descent methods are known as gradient methods. Among the gradient methods,
those requiring only first derivatives of the function are called first-order methods; those
requiring both first and second derivatives of the function are termed second-order
methods.

Table 6.1 Unconstrained Minimization Methods

Direct search methods(a)      Descent methods(b)
Random search method          Steepest descent (Cauchy) method
Grid search method            Fletcher-Reeves method
Univariate method             Newton's method
Pattern search methods        Marquardt method
Powell's method               Quasi-Newton methods
Simplex method                Davidon-Fletcher-Powell method
                              Broyden-Fletcher-Goldfarb-Shanno method

(a) Do not require the derivatives of the function.
(b) Require the derivatives of the function.


6.1.2 General Approach 

All the unconstrained minimization methods are iterative in nature and hence they start
from an initial trial solution and proceed toward the minimum point in a sequential
manner as shown in Fig. 5.3. The iterative process is given by

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^*\mathbf{S}_i \tag{6.4}$$

where $\mathbf{X}_i$ is the starting point, $\mathbf{S}_i$ is the search direction, $\lambda_i^*$ is the optimal step length,
and $\mathbf{X}_{i+1}$ is the final point in iteration i. It is important to note that all the unconstrained
minimization methods (1) require an initial point $\mathbf{X}_1$ to start the iterative procedure,
and (2) differ from one another only in the method of generating the new point $\mathbf{X}_{i+1}$
(from $\mathbf{X}_i$) and in testing the point $\mathbf{X}_{i+1}$ for optimality.


6.1.3 Rate of Convergence 

Different iterative optimization methods have different rates of convergence. In general,
an optimization method is said to have convergence of order p if [6.2]

$$\frac{\|\mathbf{X}_{i+1} - \mathbf{X}^*\|}{\|\mathbf{X}_i - \mathbf{X}^*\|^p} \le k, \qquad k \ge 0,\; p \ge 1 \tag{6.5}$$

where $\mathbf{X}_i$ and $\mathbf{X}_{i+1}$ denote the points obtained at the end of iterations i and i + 1,
respectively, $\mathbf{X}^*$ represents the optimum point, and $\|\mathbf{X}\|$ denotes the length or norm of
the vector $\mathbf{X}$:

$$\|\mathbf{X}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

If p = 1 and 0 < k < 1, the method is said to be linearly convergent (corresponds
to slow convergence). If p = 2, the method is said to be quadratically convergent
(corresponds to fast convergence). An optimization method is said to have superlinear
convergence (corresponds to fast convergence) if

$$\lim_{i \to \infty} \frac{\|\mathbf{X}_{i+1} - \mathbf{X}^*\|}{\|\mathbf{X}_i - \mathbf{X}^*\|} = 0 \tag{6.6}$$

The definitions of rates of convergence given in Eqs. (6.5) and (6.6) are applicable
to single-variable as well as multivariable optimization problems. In the case of
single-variable problems, the vector, $\mathbf{X}_i$, for example, degenerates to a scalar, $x_i$.
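The order p in Eq. (6.5) can be estimated numerically from three successive errors, since $e_{i+1} \approx k e_i^p$ and $e_i \approx k e_{i-1}^p$ together give $p \approx \ln(e_{i+1}/e_i)/\ln(e_i/e_{i-1})$. The sketch below applies this estimate to two single-variable sequences chosen here purely for illustration: a model sequence whose error is halved in every iteration, and the Newton iteration for minimizing $f(x) = x^3/3 - 2x$, whose minimum is at $x^* = \sqrt{2}$. The helper name `estimate_order` is an assumption of this sketch.

```python
import math


def estimate_order(e_prev, e_curr, e_next):
    """Estimate p in e_next ~ k * e_curr**p from three successive errors."""
    return math.log(e_next / e_curr) / math.log(e_curr / e_prev)


# Linearly convergent model sequence: the error is halved in every iteration (k = 0.5).
linear_errors = [0.5**i for i in range(1, 5)]
p_linear = estimate_order(*linear_errors[:3])

# Newton iteration for minimizing f(x) = x**3/3 - 2*x:
# x <- x - f'(x)/f''(x) = x - (x**2 - 2)/(2*x) = (x + 2/x)/2
x_star = math.sqrt(2.0)
x = 1.5
newton_errors = []
for _ in range(3):
    newton_errors.append(abs(x - x_star))
    x = 0.5 * (x + 2.0 / x)
p_newton = estimate_order(*newton_errors)

print(p_linear, p_newton)
```

The first estimate equals 1 (linear convergence), while the second is close to 2, which is the quadratic convergence expected of Newton's method near a minimum with a nonzero second derivative.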


6.1.4 Scaling of Design Variables 

The rate of convergence of most unconstrained minimization methods can be improved
by scaling the design variables. For a quadratic objective function, the scaling of the
design variables changes the condition number† of the Hessian matrix. When the
condition number of the Hessian matrix is 1, the steepest descent method, for example,
finds the minimum of a quadratic objective function in one iteration.

If $f = \frac{1}{2}\mathbf{X}^T[A]\mathbf{X}$ denotes a quadratic term, a transformation of the form

$$\mathbf{X} = [R]\mathbf{Y} \quad\text{or}\quad \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{bmatrix} \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} \tag{6.7}$$

can be used to obtain a new quadratic term as

$$\tfrac{1}{2}\mathbf{Y}^T[\tilde{A}]\mathbf{Y} = \tfrac{1}{2}\mathbf{Y}^T[R]^T[A][R]\mathbf{Y} \tag{6.8}$$

The matrix $[R]$ can be selected to make $[\tilde{A}] = [R]^T[A][R]$ diagonal (i.e., to eliminate
the mixed quadratic terms). For this, the columns of the matrix $[R]$ are to be chosen
as the eigenvectors of the matrix $[A]$. Next the diagonal elements of the matrix $[\tilde{A}]$
can be reduced to 1 (so that the condition number of the resulting matrix will be 1) by
using the transformation

$$\mathbf{Y} = [S]\mathbf{Z} \quad\text{or}\quad \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} = \begin{bmatrix} s_{11} & 0 \\ 0 & s_{22} \end{bmatrix} \begin{Bmatrix} z_1 \\ z_2 \end{Bmatrix} \tag{6.9}$$

where the matrix $[S]$ is given by

$$[S] = \begin{bmatrix} s_{11} = \dfrac{1}{\sqrt{\tilde{a}_{11}}} & 0 \\ 0 & s_{22} = \dfrac{1}{\sqrt{\tilde{a}_{22}}} \end{bmatrix} \tag{6.10}$$

Thus the complete transformation that reduces the Hessian matrix of f to an identity
matrix is given by

$$\mathbf{X} = [R][S]\mathbf{Z} = [T]\mathbf{Z} \tag{6.11}$$

so that the quadratic term $\frac{1}{2}\mathbf{X}^T[A]\mathbf{X}$ reduces to $\frac{1}{2}\mathbf{Z}^T[I]\mathbf{Z}$.

If the objective function is not a quadratic, the Hessian matrix and hence the
transformations vary with the design vector from iteration to iteration. For example,

† The condition number of an n × n matrix, $[A]$, is defined as

$$\text{cond}([A]) = \|[A]\|\,\|[A]^{-1}\| \ge 1$$

where $\|[A]\|$ denotes a norm of the matrix $[A]$. For example, the infinite norm of $[A]$ is defined as the
maximum row sum given by

$$\|[A]\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$$

If the condition number is close to 1, the round-off errors are expected to be small in dealing with the
matrix $[A]$. For example, if cond$[A]$ is large, the solution vector $\mathbf{X}$ of the system of equations $[A]\mathbf{X} = \mathbf{B}$ is
expected to be very sensitive to small variations in $[A]$ and $\mathbf{B}$. If cond$[A]$ is close to 1, the matrix $[A]$ is
said to be well behaved or well conditioned. On the other hand, if cond$[A]$ is significantly greater than 1,
the matrix $[A]$ is said to be not well behaved or ill conditioned.

the second-order Taylor's series approximation of a general nonlinear function at the
design vector $\mathbf{X}_i$ can be expressed as

$$f(\mathbf{X}) = c + \mathbf{B}^T\mathbf{X} + \tfrac{1}{2}\mathbf{X}^T[A]\mathbf{X} \tag{6.12}$$

where

$$c = f(\mathbf{X}_i) \tag{6.13}$$

$$\mathbf{B} = \begin{Bmatrix} \left.\dfrac{\partial f}{\partial x_1}\right|_{\mathbf{X}_i} \\ \vdots \\ \left.\dfrac{\partial f}{\partial x_n}\right|_{\mathbf{X}_i} \end{Bmatrix} \tag{6.14}$$

$$[A] = \begin{bmatrix} \left.\dfrac{\partial^2 f}{\partial x_1^2}\right|_{\mathbf{X}_i} & \cdots & \left.\dfrac{\partial^2 f}{\partial x_1\,\partial x_n}\right|_{\mathbf{X}_i} \\ \vdots & \ddots & \vdots \\ \left.\dfrac{\partial^2 f}{\partial x_n\,\partial x_1}\right|_{\mathbf{X}_i} & \cdots & \left.\dfrac{\partial^2 f}{\partial x_n^2}\right|_{\mathbf{X}_i} \end{bmatrix} \tag{6.15}$$

The transformations indicated by Eqs. (6.7) and (6.9) can be applied to the matrix $[A]$
given by Eq. (6.15). The procedure of scaling the design variables is illustrated with
the following example.


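Example 6.2 Find a suitable scaling (or transformation) of variables to reduce the
condition number of the Hessian matrix of the following function to 1:

$$f(x_1, x_2) = 6x_1^2 - 6x_1x_2 + 2x_2^2 - x_1 - 2x_2 \tag{E1}$$

SOLUTION The quadratic function can be expressed as

$$f(\mathbf{X}) = \mathbf{B}^T\mathbf{X} + \tfrac{1}{2}\mathbf{X}^T[A]\mathbf{X} \tag{E2}$$

where

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}, \qquad \mathbf{B} = \begin{Bmatrix} -1 \\ -2 \end{Bmatrix}, \qquad\text{and}\qquad [A] = \begin{bmatrix} 12 & -6 \\ -6 & 4 \end{bmatrix}$$

As indicated above, the desired scaling of variables can be accomplished in two
stages.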


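Stage 1: Reducing $[A]$ to a Diagonal Form, $[\tilde{A}]$

The eigenvectors of the matrix $[A]$ can be found by solving the eigenvalue problem

$$\left[[A] - \lambda_i[I]\right]\mathbf{u}_i = \mathbf{0} \tag{E3}$$

where $\lambda_i$ is the ith eigenvalue and $\mathbf{u}_i$ is the corresponding eigenvector. In the present
case, the eigenvalues, $\lambda_i$, are given by

$$\begin{vmatrix} 12 - \lambda_i & -6 \\ -6 & 4 - \lambda_i \end{vmatrix} = \lambda_i^2 - 16\lambda_i + 12 = 0 \tag{E4}$$

which yield $\lambda_1 = 8 + \sqrt{52} = 15.2111$ and $\lambda_2 = 8 - \sqrt{52} = 0.7889$. The eigenvector
$\mathbf{u}_i$ corresponding to $\lambda_i$ can be found by solving Eq. (E3):

$$\begin{bmatrix} 12 - \lambda_1 & -6 \\ -6 & 4 - \lambda_1 \end{bmatrix} \begin{Bmatrix} u_{11} \\ u_{21} \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \quad\text{or}\quad (12 - \lambda_1)u_{11} - 6u_{21} = 0 \quad\text{or}\quad u_{21} = -0.5352\,u_{11}$$

that is,

$$\mathbf{u}_1 = \begin{Bmatrix} u_{11} \\ u_{21} \end{Bmatrix} = \begin{Bmatrix} 1.0 \\ -0.5352 \end{Bmatrix}$$

and

$$\begin{bmatrix} 12 - \lambda_2 & -6 \\ -6 & 4 - \lambda_2 \end{bmatrix} \begin{Bmatrix} u_{12} \\ u_{22} \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \quad\text{or}\quad (12 - \lambda_2)u_{12} - 6u_{22} = 0 \quad\text{or}\quad u_{22} = 1.8685\,u_{12}$$

that is,

$$\mathbf{u}_2 = \begin{Bmatrix} u_{12} \\ u_{22} \end{Bmatrix} = \begin{Bmatrix} 1.0 \\ 1.8685 \end{Bmatrix}$$

Thus the transformation that reduces $[A]$ to a diagonal form is given by

$$\mathbf{X} = [R]\mathbf{Y} = [\mathbf{u}_1\;\; \mathbf{u}_2]\mathbf{Y} = \begin{bmatrix} 1 & 1 \\ -0.5352 & 1.8685 \end{bmatrix} \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} \tag{E5}$$

that is,

$$x_1 = y_1 + y_2$$

$$x_2 = -0.5352\,y_1 + 1.8685\,y_2$$

This yields the new quadratic term as $\frac{1}{2}\mathbf{Y}^T[\tilde{A}]\mathbf{Y}$, where

$$[\tilde{A}] = [R]^T[A][R] = \begin{bmatrix} 19.5682 & 0.0 \\ 0.0 & 3.5432 \end{bmatrix}$$

and hence the quadratic function becomes

$$\begin{aligned}
f(y_1, y_2) &= \mathbf{B}^T[R]\mathbf{Y} + \tfrac{1}{2}\mathbf{Y}^T[\tilde{A}]\mathbf{Y} \\
&= 0.0704\,y_1 - 4.7370\,y_2 + \tfrac{1}{2}(19.5682)\,y_1^2 + \tfrac{1}{2}(3.5432)\,y_2^2
\end{aligned} \tag{E6}$$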


Stage 2: Reducing $[\tilde{A}]$ to a Unit Matrix
The transformation is given by $\mathbf{Y} = [S]\mathbf{Z}$, where

$$[S] = \begin{bmatrix} \dfrac{1}{\sqrt{19.5682}} & 0 \\ 0 & \dfrac{1}{\sqrt{3.5432}} \end{bmatrix} = \begin{bmatrix} 0.2262 & 0.0 \\ 0.0 & 0.5313 \end{bmatrix}$$

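Stage 3: Complete Transformation
The total transformation is given by

$$\mathbf{X} = [R]\mathbf{Y} = [R][S]\mathbf{Z} = [T]\mathbf{Z} \tag{E7}$$

where

$$[T] = [R][S] = \begin{bmatrix} 1 & 1 \\ -0.5352 & 1.8685 \end{bmatrix} \begin{bmatrix} 0.2262 & 0 \\ 0 & 0.5313 \end{bmatrix} = \begin{bmatrix} 0.2262 & 0.5313 \\ -0.1211 & 0.9927 \end{bmatrix}$$

or

$$x_1 = 0.2262\,z_1 + 0.5313\,z_2$$

$$x_2 = -0.1211\,z_1 + 0.9927\,z_2 \tag{E8}$$

With this transformation, the quadratic function of Eq. (E1) becomes

$$\begin{aligned}
f(z_1, z_2) &= \mathbf{B}^T[T]\mathbf{Z} + \tfrac{1}{2}\mathbf{Z}^T[T]^T[A][T]\mathbf{Z} \\
&= 0.0160\,z_1 - 2.5167\,z_2 + \tfrac{1}{2}z_1^2 + \tfrac{1}{2}z_2^2
\end{aligned} \tag{E9}$$

The contours of the quadratic functions given by Eqs. (E1), (E6), and (E9) are shown
in Fig. 6.3a, b, and c, respectively.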
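The two-stage scaling of this example can be reproduced numerically. The following is a minimal sketch in plain Python for the 2 × 2 case only; the helper names `matmul` and `transpose` and the variable names are assumptions of this sketch, and the eigenvectors are normalized so that their first component is 1, as in the example. Small differences in the fourth decimal place relative to the values quoted in the example are to be expected, because the example carries rounded intermediate values.

```python
import math


def matmul(P, Q):
    """Product of two matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*P)]


A = [[12.0, -6.0], [-6.0, 4.0]]          # Hessian of the function in Eq. (E1)

# Eigenvalues of the symmetric 2 x 2 matrix from its characteristic equation.
half_trace = 0.5 * (A[0][0] + A[1][1])
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = math.sqrt(half_trace**2 - det)
lam1, lam2 = half_trace + root, half_trace - root

# Eigenvectors with first component 1:
# (a11 - lam) * 1 + a12 * u2 = 0  ->  u2 = (lam - a11) / a12
u1 = [1.0, (lam1 - A[0][0]) / A[0][1]]
u2 = [1.0, (lam2 - A[0][0]) / A[0][1]]

R = [[u1[0], u2[0]], [u1[1], u2[1]]]     # columns are the eigenvectors (stage 1)
A_tilde = matmul(transpose(R), matmul(A, R))
S = [[1.0 / math.sqrt(A_tilde[0][0]), 0.0],
     [0.0, 1.0 / math.sqrt(A_tilde[1][1])]]          # stage 2
T = matmul(R, S)                                     # complete transformation
H_scaled = matmul(transpose(T), matmul(A, T))        # should be close to the identity

print(lam1, lam2)
print(T)
print(H_scaled)
```

The last matrix printed is the Hessian in the z variables; its closeness to the identity matrix confirms that the condition number has been reduced to 1.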


Direct Search Methods 


6.2 RANDOM SEARCH METHODS 

Random search methods are based on the use of random numbers in finding the
minimum point. Since most of the computer libraries have random number generators,
these methods can be used quite conveniently. Some of the best known random search
methods are presented in this section.

Figure 6.3 Contours of the original and transformed functions.

6.2.1 Random Jumping Method

Although the problem is an unconstrained one, we establish the bounds $l_i$ and $u_i$ for
each design variable $x_i$, i = 1, 2, ..., n, for generating the random values of $x_i$:

$$l_i \le x_i \le u_i, \qquad i = 1, 2, \ldots, n \tag{6.16}$$

In the random jumping method, we generate sets of n random numbers, $(r_1, r_2, \ldots, r_n)$,
that are uniformly distributed between 0 and 1. Each set of these numbers is used to
find a point, $\mathbf{X}$, inside the hypercube defined by Eqs. (6.16) as

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix} = \begin{Bmatrix} l_1 + r_1(u_1 - l_1) \\ l_2 + r_2(u_2 - l_2) \\ \vdots \\ l_n + r_n(u_n - l_n) \end{Bmatrix} \tag{6.17}$$

and the value of the function is evaluated at this point $\mathbf{X}$. By generating a large number
of random points $\mathbf{X}$ and evaluating the value of the objective function at each of these
points, we can take the point with the smallest value of $f(\mathbf{X})$ as the desired minimum point.
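The random jumping procedure can be sketched in a few lines. The code below is an illustration only: the function name `random_jumping`, the seed, the number of points, and the bounds of -3 and 3 on both variables are assumptions made here, and the test function is the one used in Example 6.3, whose minimum value is -1.25 at (-1, 1.5).

```python
import random


def random_jumping(f, lower, upper, n_points, seed=0):
    """Evaluate f at n_points uniform random points in the box [lower, upper]."""
    rng = random.Random(seed)            # seeded generator so that runs are repeatable
    best_x, best_f = None, float("inf")
    for _ in range(n_points):
        # Eq. (6.17): x_i = l_i + r_i * (u_i - l_i), with r_i uniform in [0, 1)
        x = [l + rng.random() * (u - l) for l, u in zip(lower, upper)]
        fx = f(x)
        if fx < best_f:                  # keep the best point found so far
            best_x, best_f = x, fx
    return best_x, best_f


def f_example(x):
    """Objective function of Example 6.3."""
    x1, x2 = x
    return x1 - x2 + 2 * x1**2 + 2 * x1 * x2 + x2**2


x_best, f_best = random_jumping(f_example, [-3.0, -3.0], [3.0, 3.0], 20000)
print(x_best, f_best)
```

The accuracy of the result improves only slowly with the number of points, which is why the method is better suited to locating a promising region than to finding a precise minimum.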
6.2.2 Random Walk Method

The random walk method is based on generating a sequence of improved approximations
to the minimum, each derived from the preceding approximation. Thus if $\mathbf{X}_i$ is
the approximation to the minimum obtained in the (i - 1)th stage (or step or iteration),
the new or improved approximation in the ith stage is found from the relation

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda\mathbf{u}_i \tag{6.18}$$

where $\lambda$ is a prescribed scalar step length and $\mathbf{u}_i$ is a unit random vector generated in the
ith stage. The detailed procedure of this method is given by the following steps [6.3]:

1. Start with an initial point $\mathbf{X}_1$, a sufficiently large initial step length $\lambda$, a minimum
allowable step length $\varepsilon$, and a maximum permissible number of iterations N.

2. Find the function value $f_1 = f(\mathbf{X}_1)$.

3. Set the iteration number as i = 1.

4. Generate a set of n random numbers $r_1, r_2, \ldots, r_n$ each lying in the interval
[-1, 1] and formulate the unit vector $\mathbf{u}$ as

$$\mathbf{u} = \frac{1}{(r_1^2 + r_2^2 + \cdots + r_n^2)^{1/2}} \begin{Bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{Bmatrix} \tag{6.19}$$

The directions generated using Eq. (6.19) are expected to have a bias toward
the diagonals of the unit hypercube [6.3]. To avoid such a bias, the length
of the vector, R, is computed as

$$R = (r_1^2 + r_2^2 + \cdots + r_n^2)^{1/2}$$

and the random numbers generated $(r_1, r_2, \ldots, r_n)$ are accepted only if $R \le 1$
but are discarded if R > 1. If the random numbers are accepted, the unbiased
random vector $\mathbf{u}_i$ is given by Eq. (6.19).

5. Compute the new vector and the corresponding function value as $\mathbf{X} = \mathbf{X}_1 + \lambda\mathbf{u}$
and $f = f(\mathbf{X})$.

6. Compare the values of f and $f_1$. If $f < f_1$, set the new values as $\mathbf{X}_1 = \mathbf{X}$ and
$f_1 = f$ and go to step 3. If $f \ge f_1$, go to step 7.

7. If $i \le N$, set the new iteration number as i = i + 1 and go to step 4. On the
other hand, if i > N, go to step 8.

8. Compute the new, reduced, step length as $\lambda = \lambda/2$. If the new step length is
smaller than or equal to $\varepsilon$, go to step 9. Otherwise (i.e., if the new step length
is greater than $\varepsilon$), go to step 4.

9. Stop the procedure by taking $\mathbf{X}_{\text{opt}} \approx \mathbf{X}_1$ and $f_{\text{opt}} \approx f_1$.

This method is illustrated with the following example. 


Example 6.3 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$ using random walk
method from the point $\mathbf{X}_1 = \{0.0, 0.0\}^T$ with a starting step length of $\lambda = 1.0$. Take $\varepsilon = 0.05$
and N = 100.

SOLUTION The results are summarized in Table 6.2, where only the trials that
produced an improvement are shown.

Table 6.2 Minimization of f by Random Walk Method

Step length,   Number of trials   Components of X1 + λu     Current objective function
λ              required(a)        1            2            value, f1 = f(X1 + λu)

1.0            1                  -0.93696     0.34943      -0.06329
1.0            2                  -1.15271     1.32588      -1.11986
               Next 100 trials did not reduce the function value.
0.5            1                  -1.34361     1.78800      -1.12884
0.5            3                  -1.07318     1.36744      -1.20232
               Next 100 trials did not reduce the function value.
0.25           4                  -0.86419     1.23025      -1.21362
0.25           2                  -0.86955     1.48019      -1.22074
0.25           8                  -1.10661     1.55958      -1.23642
0.25           30                 -0.94278     1.37074      -1.24154
0.25           6                  -1.08729     1.57474      -1.24222
0.25           50                 -0.92606     1.38368      -1.24274
0.25           23                 -1.07912     1.58135      -1.24374
               Next 100 trials did not reduce the function value.
0.125          1                  -0.97986     1.50538      -1.24894
               Next 100 trials did not reduce the function value.
0.0625         100 trials did not reduce the function value.
0.03125        As this step length is smaller than ε, the program is terminated.

(a) Out of the directions generated that satisfy R ≤ 1, number of trials required to find a direction that also
reduces the value of f.
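The steps of the random walk method can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the program used to produce Table 6.2: the function name `random_walk` and the seed are chosen here, the trial counter is reset after every step-length reduction, and rejected random vectors (R > 1) are not counted as trials. Because the random numbers differ, the intermediate points will not match those of the table, although the final function value should again be close to -1.25.

```python
import math
import random


def random_walk(f, x1, step, eps, max_trials, seed=0):
    """Random walk search with step halving (steps 1 to 9 of Section 6.2.2)."""
    rng = random.Random(seed)
    n = len(x1)
    x_best, f_best = list(x1), f(x1)                 # steps 1 and 2
    while step > eps:                                # step 8: stop once step <= eps
        i = 1                                        # step 3
        while i <= max_trials:                       # step 7
            r = [rng.uniform(-1.0, 1.0) for _ in range(n)]        # step 4
            R = math.sqrt(sum(ri * ri for ri in r))
            if R > 1.0 or R == 0.0:
                continue                             # discard to avoid directional bias
            u = [ri / R for ri in r]                 # unit random vector, Eq. (6.19)
            x = [xb + step * ui for xb, ui in zip(x_best, u)]     # step 5
            fx = f(x)
            if fx < f_best:                          # step 6: success, restart the count
                x_best, f_best = x, fx
                i = 1
            else:
                i += 1
        step /= 2.0                                  # step 8: reduce the step length
    return x_best, f_best                            # step 9


def f_example(x):
    """Objective function of Example 6.3."""
    x1, x2 = x
    return x1 - x2 + 2 * x1**2 + 2 * x1 * x2 + x2**2


x_opt, f_opt = random_walk(f_example, [0.0, 0.0], step=1.0, eps=0.05, max_trials=100)
print(x_opt, f_opt)
```

With a starting step length of 1.0 and a limit of 0.05, the step lengths actually tried are 1.0, 0.5, 0.25, 0.125, and 0.0625, which is the same sequence as in Table 6.2.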


6.2.3 Random Walk Method with Direction Exploitation 

In the random walk method described in Section 6.2.2, we proceed to generate a new
unit random vector $\mathbf{u}_{i+1}$ as soon as we find that $\mathbf{u}_i$ is successful in reducing the function
value for a fixed step length $\lambda$. However, we can expect to achieve a further decrease
in the function value by taking a longer step length along the direction $\mathbf{u}_i$. Thus the
random walk method can be improved if the maximum possible step is taken along
each successful direction. This can be achieved by using any of the one-dimensional
minimization methods discussed in Chapter 5. According to this procedure, the new
vector $\mathbf{X}_{i+1}$ is found as

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^*\mathbf{u}_i \tag{6.20}$$

where $\lambda_i^*$ is the optimal step length found along the direction $\mathbf{u}_i$ so that

$$f_{i+1} = f(\mathbf{X}_i + \lambda_i^*\mathbf{u}_i) = \min_{\lambda_i} f(\mathbf{X}_i + \lambda_i\mathbf{u}_i) \tag{6.21}$$

The search method incorporating this feature is called the random walk method with
direction exploitation.
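A minimal sketch of direction exploitation is given below. As assumptions of this sketch, the one-dimensional minimization of Eq. (6.21) is carried out by a golden section search over the interval $0 \le \lambda_i \le \lambda_{\max}$, with `lam_max`, the number of random directions `n_dirs`, the seed, and the function names all chosen here for illustration. The golden section search presumes that the function is unimodal along each direction, which holds for the quadratic test function of Example 6.3.

```python
import math
import random


def golden_section(phi, a, b, tol=1e-6):
    """Minimize a unimodal function phi on [a, b] by golden section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = phi(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)


def random_walk_exploit(f, x1, lam_max, n_dirs, seed=0):
    """Random directions with a line search along each one, Eqs. (6.20) and (6.21)."""
    rng = random.Random(seed)
    n = len(x1)
    x, fx = list(x1), f(x1)
    for _ in range(n_dirs):
        r = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        R = math.sqrt(sum(ri * ri for ri in r))
        if R > 1.0 or R == 0.0:
            continue                     # rejected to avoid directional bias
        u = [ri / R for ri in r]
        lam = golden_section(
            lambda t: f([xi + t * ui for xi, ui in zip(x, u)]), 0.0, lam_max)
        x_new = [xi + lam * ui for xi, ui in zip(x, u)]
        f_new = f(x_new)
        if f_new < fx:                   # accept only if the function value decreases
            x, fx = x_new, f_new
    return x, fx


def f_example(x):
    """Objective function of Example 6.3."""
    x1, x2 = x
    return x1 - x2 + 2 * x1**2 + 2 * x1 * x2 + x2**2


x_opt, f_opt = random_walk_exploit(f_example, [0.0, 0.0], lam_max=2.0, n_dirs=200)
print(x_opt, f_opt)
```

Each successful direction is now used to its full extent, so far fewer random directions are typically needed than with the fixed step length of Section 6.2.2, at the cost of several function evaluations per line search.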
6.2.4 Advantages of Random Search Methods

1. These methods can work even if the objective function is discontinuous and
nondifferentiable at some of the points.

2. The random methods can be used to find the global minimum when the objective
function possesses several relative minima.

3. These methods are applicable when other methods fail due to local difficulties 
such as sharply varying functions and shallow regions. 

4. Although the random methods are not very efficient by themselves, they can be 
used in the early stages of optimization to detect the region where the global 
minimum is likely to be found. Once this region is found, some of the more effi- 
cient techniques can be used to find the precise location of the global minimum 
point. 


6.3 GRID SEARCH METHOD 

This method involves setting up a suitable grid in the design space, evaluating the
objective function at all the grid points, and finding the grid point corresponding to
the lowest function value. For example, if the lower and upper bounds on the ith
design variable are known to be $l_i$ and $u_i$, respectively, we can divide the range $(l_i, u_i)$
into $p_i - 1$ equal parts so that $x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(p_i)}$ denote the grid points along the $x_i$
axis (i = 1, 2, ..., n). This leads to a total of $p_1 p_2 \cdots p_n$ grid points in the design
space. A grid with $p_i = 4$ is shown in a two-dimensional design space in Fig. 6.4. The
grid points can also be chosen based on methods of experimental design [6.4, 6.5].
It can be seen that the grid method requires a prohibitively large number of function
evaluations in most practical problems. For example, for a problem with 10 design
variables (n = 10), the number of grid points will be $3^{10} = 59{,}049$ with $p_i = 3$ and
$4^{10} = 1{,}048{,}576$ with $p_i = 4$. However, for problems with a small number of design
variables, the grid method can be used conveniently to find an approximate minimum.
Also, the grid method can be used to find a good starting point for one of the more
efficient methods.

Figure 6.4 Grid with $p_i = 4$.
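The grid search can be sketched with a Cartesian product over the grid points of each axis. In the code below the function name `grid_search`, the bounds of -2 and 2, and the choice of nine points per axis are assumptions made for illustration; the test function is the one used in Example 6.3. At least two points per axis are required so that the range can be divided into $p_i - 1$ parts.

```python
import itertools


def grid_search(f, lower, upper, points_per_axis):
    """Evaluate f at every point of a regular grid and return the best point."""
    axes = []
    for l, u, p in zip(lower, upper, points_per_axis):
        # p grid points divide the range (l, u) into p - 1 equal parts
        axes.append([l + k * (u - l) / (p - 1) for k in range(p)])
    best_x, best_f, n_eval = None, float("inf"), 0
    for x in itertools.product(*axes):   # p_1 * p_2 * ... * p_n grid points
        fx = f(x)
        n_eval += 1
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f, n_eval


def f_example(x):
    """Objective function of Example 6.3."""
    x1, x2 = x
    return x1 - x2 + 2 * x1**2 + 2 * x1 * x2 + x2**2


x_best, f_best, n_eval = grid_search(f_example, [-2.0, -2.0], [2.0, 2.0], [9, 9])
print(x_best, f_best, n_eval)
```

With nine points per axis the grid spacing is 0.5, so the exact minimum (-1, 1.5) happens to be a grid point here; in general only the best grid point is found, and the number of evaluations grows as the product of the $p_i$.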

6.4 UNIVARIATE METHOD 

In this method we change only one variable at a time and seek to produce a sequence
of improved approximations to the minimum point. By starting at a base point $\mathbf{X}_i$ in the
ith iteration, we fix the values of n - 1 variables and vary the remaining variable. Since
only one variable is changed, the problem becomes a one-dimensional minimization
problem and any of the methods discussed in Chapter 5 can be used to produce a new
base point $\mathbf{X}_{i+1}$. The search is now continued in a new direction. This new direction
is obtained by changing any one of the n - 1 variables that were fixed in the previous
iteration. In fact, the search procedure is continued by taking each coordinate direction
in turn. After all the n directions are searched sequentially, the first cycle is complete
and hence we repeat the entire process of sequential minimization. The procedure is
continued until no further improvement is possible in the objective function in any of
the n directions of a cycle. The univariate method can be summarized as follows:

1. Choose an arbitrary starting point $\mathbf{X}_1$ and set i = 1.

2. Find the search direction $\mathbf{S}_i$ as

$$\mathbf{S}_i^T = \begin{cases}
(1, 0, 0, \ldots, 0) & \text{for } i = 1, n + 1, 2n + 1, \ldots \\
(0, 1, 0, \ldots, 0) & \text{for } i = 2, n + 2, 2n + 2, \ldots \\
(0, 0, 1, \ldots, 0) & \text{for } i = 3, n + 3, 2n + 3, \ldots \\
\quad\vdots \\
(0, 0, 0, \ldots, 1) & \text{for } i = n, 2n, 3n, \ldots
\end{cases} \tag{6.22}$$

3. Determine whether $\lambda_i$ should be positive or negative. For the current direction
$\mathbf{S}_i$, this means find whether the function value decreases in the positive or
negative direction. For this we take a small probe length ($\varepsilon$) and evaluate $f_i = f(\mathbf{X}_i)$,
$f^+ = f(\mathbf{X}_i + \varepsilon\mathbf{S}_i)$, and $f^- = f(\mathbf{X}_i - \varepsilon\mathbf{S}_i)$. If $f^+ < f_i$, $\mathbf{S}_i$ will be the
correct direction for decreasing the value of f and if $f^- < f_i$, $-\mathbf{S}_i$ will be the
correct one. If both $f^+$ and $f^-$ are greater than $f_i$, we take $\mathbf{X}_i$ as the minimum
along the direction $\mathbf{S}_i$.

4. Find the optimal step length $\lambda_i^*$ such that

$$f(\mathbf{X}_i \pm \lambda_i^*\mathbf{S}_i) = \min_{\lambda_i} f(\mathbf{X}_i \pm \lambda_i\mathbf{S}_i) \tag{6.23}$$

where + or - sign has to be used depending upon whether $\mathbf{S}_i$ or $-\mathbf{S}_i$ is the
direction for decreasing the function value.

5. Set $\mathbf{X}_{i+1} = \mathbf{X}_i \pm \lambda_i^*\mathbf{S}_i$ depending on the direction for decreasing the function
value, and $f_{i+1} = f(\mathbf{X}_{i+1})$.

6. Set the new value of i = i + 1 and go to step 2. Continue this procedure until
no significant change is achieved in the value of the objective function.
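The steps above can be sketched as follows. This is an illustration under stated assumptions: the optimal step length of step 4 is found by a golden section search on $0 \le \lambda_i \le \lambda_{\max}$ (any method of Chapter 5 could be used instead), and the names `univariate`, `lam_max`, `tol`, and `max_cycles` are chosen here. The test function is the quadratic used in Example 6.3, whose minimum is at (-1, 1.5).

```python
import math


def golden_section(phi, a, b, tol=1e-8):
    """Minimize a unimodal function phi on [a, b] by golden section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = phi(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)


def univariate(f, x1, eps=0.01, lam_max=5.0, tol=1e-10, max_cycles=100):
    """Univariate search: one coordinate direction at a time."""
    x, fx, n = list(x1), f(x1), len(x1)                     # step 1
    for _ in range(max_cycles):
        f_cycle_start = fx
        for j in range(n):
            s = [1.0 if k == j else 0.0 for k in range(n)]  # step 2, Eq. (6.22)
            f_plus = f([xi + eps * si for xi, si in zip(x, s)])    # step 3: probe
            f_minus = f([xi - eps * si for xi, si in zip(x, s)])
            if f_plus < fx:
                sign = 1.0
            elif f_minus < fx:
                sign = -1.0
            else:
                continue                 # x is taken as the minimum along s
            lam = golden_section(        # step 4, Eq. (6.23)
                lambda t: f([xi + sign * t * si for xi, si in zip(x, s)]),
                0.0, lam_max)
            x_new = [xi + sign * lam * si for xi, si in zip(x, s)]  # step 5
            f_new = f(x_new)
            if f_new < fx:
                x, fx = x_new, f_new
        if f_cycle_start - fx < tol:     # step 6: no significant change in a cycle
            break
    return x, fx


def f_example(x):
    """Objective function of Example 6.3."""
    x1, x2 = x
    return x1 - x2 + 2 * x1**2 + 2 * x1 * x2 + x2**2


x_opt, f_opt = univariate(f_example, [0.0, 0.0])
print(x_opt, f_opt)
```

Note that the search stops once the probe of step 3 fails in every coordinate direction, so the attainable accuracy is limited by the probe length; a smaller `eps` gives a more precise result at the cost of more cycles.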
The univariate method is very simple and can be implemented easily. However,
it will not converge rapidly to the optimum solution, as it has a tendency to oscillate
with steadily decreasing progress toward the optimum. Hence it will be better to
stop the computations at some point near to the optimum point rather than trying to
find the precise optimum point. In theory, the univariate method can be applied to find
the minimum of any function that possesses continuous derivatives. However, if the
function has a steep valley, the method may not even converge. For example, consider
the contours of a function of two variables with a valley as shown in Fig. 6.5. If the
univariate search starts at point P, the function value cannot be decreased either in
the direction $\pm\mathbf{S}_1$ or in the direction $\pm\mathbf{S}_2$. Thus the search comes to a halt and one
may be misled to take the point P, which is certainly not the optimum point, as the
optimum point. This situation arises whenever the value of the probe length $\varepsilon$ needed
for detecting the proper direction ($\pm\mathbf{S}_1$ or $\pm\mathbf{S}_2$) happens to be smaller than the precision
(number of significant figures) used in the computations.

Example 6.4 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$ with the starting
point (0, 0).

SOLUTION We will take the probe length ($\varepsilon$) as 0.01 to find the correct direction for
decreasing the function value in step 3. Further, we will use the differential calculus
method to find the optimum step length $\lambda_i^*$ along the direction $\pm\mathbf{S}_i$ in step 4.

Iteration i = 1 

Step 2: Choose the search direction $\mathbf{S}_1$ as $\mathbf{S}_1 = \begin{Bmatrix} 1 \\ 0 \end{Bmatrix}$.



Figure 6.5 Failure of the univariate method on a steep valley. 
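Step 3: To find whether the value of $f$ decreases along $\mathbf{S}_1$ or $-\mathbf{S}_1$, we use the
probe length $\varepsilon$. Since

$$
\begin{aligned}
f_1 &= f(\mathbf{X}_1) = f(0, 0) = 0, \\
f^+ &= f(\mathbf{X}_1 + \varepsilon\mathbf{S}_1) = f(\varepsilon, 0) = 0.01 - 0 + 2(0.0001) + 0 + 0 = 0.0102 > f_1, \\
f^- &= f(\mathbf{X}_1 - \varepsilon\mathbf{S}_1) = f(-\varepsilon, 0) = -0.01 - 0 + 2(0.0001) + 0 + 0 = -0.0098 < f_1,
\end{aligned}
$$

$-\mathbf{S}_1$ is the correct direction for minimizing $f$ from $\mathbf{X}_1$.

Step 4: To find the optimum step length $\lambda_1^*$, we minimize

$$
f(\mathbf{X}_1 - \lambda_1\mathbf{S}_1) = f(-\lambda_1, 0)
= (-\lambda_1) - 0 + 2(-\lambda_1)^2 + 0 + 0 = 2\lambda_1^2 - \lambda_1
$$

As $df/d\lambda_1 = 0$ at $\lambda_1 = \tfrac{1}{4}$, we have $\lambda_1^* = \tfrac{1}{4}$.

Step 5: Set

$$
\mathbf{X}_2 = \mathbf{X}_1 - \lambda_1^*\mathbf{S}_1
= \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} - \frac{1}{4}\begin{Bmatrix} 1 \\ 0 \end{Bmatrix}
= \begin{Bmatrix} -\tfrac{1}{4} \\ 0 \end{Bmatrix},
\qquad
f_2 = f(\mathbf{X}_2) = f\!\left(-\tfrac{1}{4}, 0\right) = -\tfrac{1}{8}.
$$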


Iteration i = 2 


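Step 2: Choose the search direction $\mathbf{S}_2$ as $\mathbf{S}_2 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}$.

Step 3: Since $f_2 = f(\mathbf{X}_2) = -0.125$,

$$
\begin{aligned}
f^+ &= f(\mathbf{X}_2 + \varepsilon\mathbf{S}_2) = f(-0.25, 0.01) = -0.1399 < f_2, \\
f^- &= f(\mathbf{X}_2 - \varepsilon\mathbf{S}_2) = f(-0.25, -0.01) = -0.1099 > f_2,
\end{aligned}
$$

$\mathbf{S}_2$ is the correct direction for decreasing the value of $f$ from $\mathbf{X}_2$.

Step 4: We minimize $f(\mathbf{X}_2 + \lambda_2\mathbf{S}_2)$ to find $\lambda_2^*$. Here

$$
\begin{aligned}
f(\mathbf{X}_2 + \lambda_2\mathbf{S}_2) &= f(-0.25, \lambda_2) \\
&= -0.25 - \lambda_2 + 2(0.25)^2 - 2(0.25)(\lambda_2) + \lambda_2^2 \\
&= \lambda_2^2 - 1.5\lambda_2 - 0.125
\end{aligned}
$$

$$
\frac{df}{d\lambda_2} = 2\lambda_2 - 1.5 = 0 \quad \text{at } \lambda_2^* = 0.75
$$

Step 5: Set

$$
\mathbf{X}_3 = \mathbf{X}_2 + \lambda_2^*\mathbf{S}_2
= \begin{Bmatrix} -0.25 \\ 0 \end{Bmatrix} + 0.75\begin{Bmatrix} 0 \\ 1 \end{Bmatrix}
= \begin{Bmatrix} -0.25 \\ 0.75 \end{Bmatrix},
\qquad
f_3 = f(\mathbf{X}_3) = -0.6875
$$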
Next we set the iteration number as $i = 3$, and continue the procedure until the optimum
solution $\mathbf{X}^* = \begin{Bmatrix} -1.0 \\ 1.5 \end{Bmatrix}$ with $f(\mathbf{X}^*) = -1.25$ is found.

Note: If the method is to be computerized, a suitable convergence criterion has to
be used to test the point $\mathbf{X}_{i+1}$ ($i = 1, 2, \ldots$) for optimality.


6.5 PATTERN DIRECTIONS 

In the univariate method, we search for the minimum along directions parallel to the 
coordinate axes. We noticed that this method may not converge in some cases, and that 
even if it converges, its convergence will be very slow as we approach the optimum 
point. These problems can be avoided by changing the directions of search in a favorable 
manner instead of retaining them always parallel to the coordinate axes. To understand 
this idea, consider the contours of the function shown in Fig. 6.6. Let the points
1, 2, 3, ... indicate the successive points found by the univariate method. It can be
noticed that the lines joining the alternate points of the search (e.g., 1, 3; 2, 4; 3, 5; 4,
6; ...) lie in the general direction of the minimum and are known as pattern directions.
It can be proved that if the objective function is a quadratic in two variables, all such
lines pass through the minimum. Unfortunately, this property will not be valid for
functions of more than two variables, even when they are quadratics. However, this idea
can still be used to achieve rapid convergence while finding the minimum of an n-variable
function. Methods that use pattern directions as search directions are known as pattern
search methods.

One of the best-known pattern search methods, Powell's method, is discussed
in Section 6.6. In general, a pattern search method takes n univariate steps, where n



Figure 6.6 Lines defined by the alternate points lie in the general direction of the minimum. 
denotes the number of design variables, and then searches for the minimum along the
pattern direction $\mathbf{S}_i$ defined by

$$
\mathbf{S}_i = \mathbf{X}_i - \mathbf{X}_{i-n}
\tag{6.24}
$$

where $\mathbf{X}_i$ is the point obtained at the end of n univariate steps and $\mathbf{X}_{i-n}$ is
the starting point before taking the n univariate steps. In general, the directions used prior to
taking a move along a pattern direction need not be univariate directions.
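
The two-variable property quoted above can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the quadratic of Example 6.4 (written as $\tfrac{1}{2}\mathbf{X}^T\mathbf{A}\mathbf{X} + \mathbf{B}^T\mathbf{X}$), exact coordinate-wise minimization, and hypothetical helper names. It tests only pairs of alternate points from point 2 onward, since each of those is the result of a line minimization; an arbitrary starting point need not share the property.

```python
A = [[4.0, 2.0], [2.0, 2.0]]   # Hessian of f = x1 - x2 + 2 x1^2 + 2 x1 x2 + x2^2
B = [1.0, -1.0]                # linear coefficients of f
X_STAR = [-1.0, 1.5]           # known minimum of f

def gradient(x):
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0],
            A[1][0] * x[0] + A[1][1] * x[1] + B[1]]

def coordinate_minimum(x, k):
    """Exact minimizer of the quadratic along coordinate k, starting from x."""
    y = list(x)
    y[k] -= gradient(x)[k] / A[k][k]
    return y

def univariate_points(x1, count):
    """Points 1, 2, ..., count of a univariate search with exact line minima."""
    pts = [list(x1)]
    for i in range(count - 1):
        pts.append(coordinate_minimum(pts[-1], i % 2))
    return pts

def cross(u, v):
    """2-D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

def pattern_residuals(x1, count=6):
    """Cross products of (X_{i+2} - X_i) with (X* - X_i) for i = 2, 3, ..."""
    pts = univariate_points(x1, count)
    out = []
    for i in range(1, count - 2):
        pattern = [pts[i + 2][j] - pts[i][j] for j in range(2)]
        to_min = [X_STAR[j] - pts[i][j] for j in range(2)]
        out.append(cross(pattern, to_min))
    return out

print(pattern_residuals([0.0, 0.0]))
```

A zero residual means the pattern direction through that pair of points is aimed exactly at the minimum.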


6.6 POWELL'S METHOD 

Powell’s method is an extension of the basic pattern search method. It is the most 
widely used direct search method and can be proved to be a method of conjugate 
directions [6.7]. A conjugate directions method will minimize a quadratic function in 
a finite number of steps. Since a general nonlinear function can be approximated rea- 
sonably well by a quadratic function near its minimum, a conjugate directions method 
is expected to speed up the convergence of even general nonlinear objective functions. 
The definition, a method of generation of conjugate directions, and the property of 
quadratic convergence are presented in this section. 


6.6.1 Conjugate Directions 

Definition: Conjugate Directions. Let $\mathbf{A} = [A]$ be an $n \times n$ symmetric matrix. A set
of n vectors (or directions) $\{\mathbf{S}_i\}$ is said to be conjugate (more accurately,
$\mathbf{A}$-conjugate) if

$$
\mathbf{S}_i^T \mathbf{A}\, \mathbf{S}_j = 0 \quad \text{for all } i \neq j,\quad i = 1, 2, \ldots, n,\quad j = 1, 2, \ldots, n
\tag{6.25}
$$

It can be seen that orthogonal directions are a special case of conjugate directions
(obtained with $[A] = [I]$ in Eq. (6.25)).

Definition: Quadratically Convergent Method. If a minimization method, using
exact arithmetic, can find the minimum point in n steps while minimizing a quadratic
function in n variables, the method is called a quadratically convergent method.

Theorem 6.1 Given a quadratic function of n variables and two parallel hyperplanes
1 and 2 of dimension $k < n$. Let the constrained stationary points of the quadratic
function in the hyperplanes be $\mathbf{X}_1$ and $\mathbf{X}_2$, respectively. Then the line joining
$\mathbf{X}_1$ and $\mathbf{X}_2$ is conjugate to any line parallel to the hyperplanes.

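Proof: Let the quadratic function be expressed as

$$
Q(\mathbf{X}) = \tfrac{1}{2}\mathbf{X}^T\mathbf{A}\mathbf{X} + \mathbf{B}^T\mathbf{X} + C
\tag{6.26}
$$

The gradient of $Q$ is given by

$$
\nabla Q(\mathbf{X}) = \mathbf{A}\mathbf{X} + \mathbf{B}
$$

and hence

$$
\nabla Q(\mathbf{X}_1) - \nabla Q(\mathbf{X}_2) = \mathbf{A}(\mathbf{X}_1 - \mathbf{X}_2)
\tag{6.27}
$$

If $\mathbf{S}$ is any vector parallel to the hyperplanes, it must be orthogonal to the gradients
$\nabla Q(\mathbf{X}_1)$ and $\nabla Q(\mathbf{X}_2)$. Thus

$$
\mathbf{S}^T \nabla Q(\mathbf{X}_1) = \mathbf{S}^T\mathbf{A}\mathbf{X}_1 + \mathbf{S}^T\mathbf{B} = 0
\tag{6.28}
$$

$$
\mathbf{S}^T \nabla Q(\mathbf{X}_2) = \mathbf{S}^T\mathbf{A}\mathbf{X}_2 + \mathbf{S}^T\mathbf{B} = 0
\tag{6.29}
$$

By subtracting Eq. (6.29) from Eq. (6.28), we obtain

$$
\mathbf{S}^T\mathbf{A}(\mathbf{X}_1 - \mathbf{X}_2) = 0
\tag{6.30}
$$

Hence $\mathbf{S}$ and $(\mathbf{X}_1 - \mathbf{X}_2)$ are $\mathbf{A}$-conjugate.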

The meaning of this theorem is illustrated in a two-dimensional space in Fig. 6.7.
If $\mathbf{X}_1$ and $\mathbf{X}_2$ are the minima of $Q$ obtained by searching along the direction
$\mathbf{S}$ from two different starting points $\mathbf{X}_a$ and $\mathbf{X}_b$, respectively, the line
$(\mathbf{X}_1 - \mathbf{X}_2)$ will be conjugate to the search direction $\mathbf{S}$.

Figure 6.7 Conjugate directions.
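
A small numerical check of Theorem 6.1 in two dimensions is sketched below. The quadratic, the direction, and the two starting points are arbitrary illustrative choices, and the helper names are hypothetical. The exact line minimum of a quadratic along $\mathbf{S}$ is obtained from $\lambda^* = -\mathbf{S}^T(\mathbf{A}\mathbf{X} + \mathbf{B}) / (\mathbf{S}^T\mathbf{A}\mathbf{S})$, which follows from setting $\mathbf{S}^T\nabla Q = 0$ at the new point.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def minimize_along(A, B, x, s):
    """Exact minimizer of Q = 0.5 x^T A x + B^T x along the line x + lam * s."""
    grad = [g + b for g, b in zip(matvec(A, x), B)]
    lam = -dot(s, grad) / dot(s, matvec(A, s))
    return [xi + lam * si for xi, si in zip(x, s)]

A = [[4.0, 2.0], [2.0, 2.0]]      # symmetric positive definite (illustrative)
B = [1.0, -1.0]
S = [1.0, 1.0]                    # common search direction

X1 = minimize_along(A, B, [1.0, 1.0], S)     # line minimum from X_a
X2 = minimize_along(A, B, [3.0, -2.0], S)    # line minimum from X_b
diff = [p - q for p, q in zip(X1, X2)]

# Theorem 6.1: S^T A (X1 - X2) should vanish.
print(dot(S, matvec(A, diff)))
```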

Theorem 6.2 If a quadratic function 

$$
Q(\mathbf{X}) = \tfrac{1}{2}\mathbf{X}^T\mathbf{A}\mathbf{X} + \mathbf{B}^T\mathbf{X} + C
\tag{6.31}
$$

is minimized sequentially, once along each direction of a set of n mutually conjugate 
directions, the minimum of the function Q will be found at or before the nth step 
irrespective of the starting point. 

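Proof: Let $\mathbf{X}^*$ minimize the quadratic function $Q(\mathbf{X})$. Then

$$
\nabla Q(\mathbf{X}^*) = \mathbf{B} + \mathbf{A}\mathbf{X}^* = \mathbf{0}
\tag{6.32}
$$

Given a point $\mathbf{X}_1$ and a set of linearly independent directions
$\mathbf{S}_1, \mathbf{S}_2, \ldots, \mathbf{S}_n$, constants $\beta_i$ can always be found such that

$$
\mathbf{X}^* = \mathbf{X}_1 + \sum_{i=1}^{n} \beta_i \mathbf{S}_i
\tag{6.33}
$$

where the vectors $\mathbf{S}_1, \mathbf{S}_2, \ldots, \mathbf{S}_n$ have been used as basis vectors. If
the directions $\mathbf{S}_i$ are $\mathbf{A}$-conjugate and none of them is zero, the $\mathbf{S}_i$ can
easily be shown to be linearly independent and the $\beta_i$ can be determined as follows.

Equations (6.32) and (6.33) lead to

$$
\mathbf{B} + \mathbf{A}\mathbf{X}_1 + \mathbf{A}\left(\sum_{i=1}^{n} \beta_i \mathbf{S}_i\right) = \mathbf{0}
\tag{6.34}
$$

Multiplying this equation throughout by $\mathbf{S}_j^T$, we obtain

$$
\mathbf{S}_j^T(\mathbf{B} + \mathbf{A}\mathbf{X}_1) + \mathbf{S}_j^T\mathbf{A}\left(\sum_{i=1}^{n} \beta_i \mathbf{S}_i\right) = 0
\tag{6.35}
$$

Since $\mathbf{S}_j^T\mathbf{A}\mathbf{S}_i = 0$ for $i \neq j$, Eq. (6.35) can be rewritten as

$$
(\mathbf{B} + \mathbf{A}\mathbf{X}_1)^T\mathbf{S}_j + \beta_j \mathbf{S}_j^T\mathbf{A}\mathbf{S}_j = 0
\tag{6.36}
$$

that is,

$$
\beta_j = -\frac{(\mathbf{B} + \mathbf{A}\mathbf{X}_1)^T\mathbf{S}_j}{\mathbf{S}_j^T\mathbf{A}\mathbf{S}_j}
\tag{6.37}
$$

Now consider an iterative minimization procedure starting at point $\mathbf{X}_1$, and successively
minimizing the quadratic $Q(\mathbf{X})$ in the directions $\mathbf{S}_1, \mathbf{S}_2, \ldots, \mathbf{S}_n$,
where these directions satisfy Eq. (6.25). The successive points are determined by the relation

$$
\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^*\mathbf{S}_i, \quad i = 1 \text{ to } n
\tag{6.38}
$$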
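where $\lambda_i^*$ is found by minimizing $Q(\mathbf{X}_i + \lambda_i\mathbf{S}_i)$ so that†

$$
\mathbf{S}_i^T \nabla Q(\mathbf{X}_{i+1}) = 0
\tag{6.39}
$$

Since the gradient of $Q$ at the point $\mathbf{X}_{i+1}$ is given by

$$
\nabla Q(\mathbf{X}_{i+1}) = \mathbf{B} + \mathbf{A}\mathbf{X}_{i+1}
\tag{6.40}
$$

Eq. (6.39) can be written as

$$
\mathbf{S}_i^T\{\mathbf{B} + \mathbf{A}(\mathbf{X}_i + \lambda_i^*\mathbf{S}_i)\} = 0
\tag{6.41}
$$

This equation gives

$$
\lambda_i^* = -\frac{(\mathbf{B} + \mathbf{A}\mathbf{X}_i)^T\mathbf{S}_i}{\mathbf{S}_i^T\mathbf{A}\mathbf{S}_i}
\tag{6.42}
$$

From Eq. (6.38), we can express $\mathbf{X}_i$ as

$$
\mathbf{X}_i = \mathbf{X}_1 + \sum_{j=1}^{i-1} \lambda_j^*\mathbf{S}_j
\tag{6.43}
$$

so that

$$
\mathbf{X}_i^T\mathbf{A}\mathbf{S}_i = \mathbf{X}_1^T\mathbf{A}\mathbf{S}_i + \sum_{j=1}^{i-1} \lambda_j^*\mathbf{S}_j^T\mathbf{A}\mathbf{S}_i
= \mathbf{X}_1^T\mathbf{A}\mathbf{S}_i
\tag{6.44}
$$

using the relation (6.25). Thus Eq. (6.42) becomes

$$
\lambda_i^* = -(\mathbf{B} + \mathbf{A}\mathbf{X}_1)^T\frac{\mathbf{S}_i}{\mathbf{S}_i^T\mathbf{A}\mathbf{S}_i}
\tag{6.45}
$$

which can be seen to be identical to Eq. (6.37). Hence the minimizing step lengths are
given by $\beta_i$ or $\lambda_i^*$. Since the optimal point $\mathbf{X}^*$ is originally expressed as a sum
of n quantities $\beta_1, \beta_2, \ldots, \beta_n$, which have been shown to be equivalent to the
minimizing step lengths, the minimization process leads to the minimum point in n steps or less.
Since we have not made any assumption regarding $\mathbf{X}_1$ and the order of
$\mathbf{S}_1, \mathbf{S}_2, \ldots, \mathbf{S}_n$, the process converges in n steps or less, independent of the
starting point as well as the order in which the minimization directions are used.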


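†$\mathbf{S}_i^T \nabla Q(\mathbf{X}_{i+1}) = 0$ is equivalent to $dQ/d\lambda_i = 0$ at $\mathbf{Y} = \mathbf{X}_{i+1}$:

$$
\frac{dQ}{d\lambda_i} = \sum_{j=1}^{n} \frac{\partial Q}{\partial y_j}\frac{\partial y_j}{\partial \lambda_i}
$$

where $y_j$ are the components of $\mathbf{Y} = \mathbf{X}_{i+1}$.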
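
Theorem 6.2 can be illustrated numerically with the step length of Eq. (6.42). The sketch below is an assumption-laden illustration: the conjugate set is generated by a Gram-Schmidt process in the $\mathbf{A}$ inner product starting from the coordinate axes (a construction chosen here for convenience, not one described in the text), $\mathbf{A}$ is assumed symmetric positive definite, and all function names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def a_conjugate_basis(A):
    """Make the coordinate axes mutually A-conjugate (Gram-Schmidt with A)."""
    n = len(A)
    dirs = []
    for k in range(n):
        u = [1.0 if j == k else 0.0 for j in range(n)]
        for s in dirs:
            coef = dot(s, matvec(A, u)) / dot(s, matvec(A, s))
            u = [ui - coef * si for ui, si in zip(u, s)]
        dirs.append(u)
    return dirs

def minimize_sequentially(A, B, x1, dirs):
    """Apply X_{i+1} = X_i + lam_i* S_i once per direction, lam_i* from Eq. (6.42)."""
    x = list(x1)
    for s in dirs:
        grad = [g + b for g, b in zip(matvec(A, x), B)]
        lam = -dot(grad, s) / dot(s, matvec(A, s))
        x = [xi + lam * si for xi, si in zip(x, s)]
    return x

# Quadratic of Example 6.4: f = x1 - x2 + 2 x1^2 + 2 x1 x2 + x2^2
A2 = [[4.0, 2.0], [2.0, 2.0]]
B2 = [1.0, -1.0]
x_min = minimize_sequentially(A2, B2, [0.0, 0.0], a_conjugate_basis(A2))
print(x_min)   # n = 2 line minimizations
```

For this two-variable quadratic the two line minimizations land on the point $(-1, 1.5)$ found in Example 6.4; changing the starting point changes the intermediate point but not the final one.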
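Example 6.5 Consider the minimization of the function

$$
f(x_1, x_2) = 6x_1^2 + 2x_2^2 - 6x_1x_2 - x_1 - 2x_2
$$

If $\mathbf{S}_1 = \begin{Bmatrix} 1 \\ 2 \end{Bmatrix}$ denotes a search direction, find a direction
$\mathbf{S}_2$ that is conjugate to the direction $\mathbf{S}_1$.

SOLUTION The objective function can be expressed in matrix form as

$$
f(\mathbf{X}) = \mathbf{B}^T\mathbf{X} + \tfrac{1}{2}\mathbf{X}^T[A]\mathbf{X}
= \{-1 \;\; -2\}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}
+ \frac{1}{2}\,(x_1 \;\; x_2)\begin{bmatrix} 12 & -6 \\ -6 & 4 \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}
$$

and the Hessian matrix $[A]$ can be identified as

$$
[A] = \begin{bmatrix} 12 & -6 \\ -6 & 4 \end{bmatrix}
$$

The direction $\mathbf{S}_2 = \begin{Bmatrix} s_1 \\ s_2 \end{Bmatrix}$ will be conjugate to
$\mathbf{S}_1 = \begin{Bmatrix} 1 \\ 2 \end{Bmatrix}$ if

$$
\mathbf{S}_1^T[A]\mathbf{S}_2 = (1 \;\; 2)\begin{bmatrix} 12 & -6 \\ -6 & 4 \end{bmatrix}\begin{Bmatrix} s_1 \\ s_2 \end{Bmatrix} = 0
$$

which upon expansion gives $2s_2 = 0$, or $s_1$ = arbitrary and $s_2 = 0$. Since $s_1$ can have
any value, we select $s_1 = 1$ and the desired conjugate direction can be expressed as

$$
\mathbf{S}_2 = \begin{Bmatrix} 1 \\ 0 \end{Bmatrix}
$$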
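
The result of Example 6.5 is easy to verify by direct multiplication. The short sketch below also shows one way, valid only in two dimensions and offered here as an illustrative shortcut rather than the text's procedure, to obtain a conjugate direction: rotate the vector $[A]\mathbf{S}_1$ by 90 degrees. The function names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def conjugacy_product(A, s1, s2):
    """Return s1^T A s2; the directions are A-conjugate when this is zero."""
    return sum(a * b for a, b in zip(s1, matvec(A, s2)))

def conjugate_direction_2d(A, s1):
    """A vector orthogonal to A s1, hence A-conjugate to s1 (A symmetric, 2 x 2)."""
    w = matvec(A, s1)
    return [w[1], -w[0]]

A = [[12, -6], [-6, 4]]        # Hessian from Example 6.5
S1 = [1, 2]

print(conjugacy_product(A, S1, [1, 0]))    # direction chosen in the example
print(conjugate_direction_2d(A, S1))       # a scalar multiple of that direction
```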



6.6.2 Algorithm 

The basic idea of Powell's method is illustrated graphically for a two-variable function
in Fig. 6.8. In this figure the function is first minimized once along each of the
coordinate directions, starting with the second coordinate direction, and then in the
corresponding pattern direction. This leads to point 5. For the next cycle of minimization,
we discard one of the coordinate directions (the $x_1$ direction in the present case) in
favor of the pattern direction. Thus we minimize along $\mathbf{u}_2$ and $\mathbf{S}_1$ and obtain point 7.
Then we generate a new pattern direction $\mathbf{S}_2$ as shown in the figure. For the next
cycle of minimization, we discard one of the previously used coordinate directions
(the $x_2$ direction in this case) in favor of the newly generated pattern direction. Then,
by starting from point 8, we minimize along directions $\mathbf{S}_1$ and $\mathbf{S}_2$, thereby obtaining
points 9 and 10, respectively. For the next cycle of minimization, since there is no
coordinate direction to discard, we restart the whole procedure by minimizing along
the $x_2$ direction. This procedure is continued until the desired minimum point is found.

The flow diagram for the version of Powell's method described above is given
in Fig. 6.9. Note that the search will be made sequentially in the directions
$\mathbf{S}_n$; $\mathbf{S}_1, \mathbf{S}_2, \mathbf{S}_3, \ldots, \mathbf{S}_{n-1}, \mathbf{S}_n$; $\mathbf{S}_p^{(1)}$;
$\mathbf{S}_2, \mathbf{S}_3, \ldots, \mathbf{S}_{n-1}, \mathbf{S}_n, \mathbf{S}_p^{(1)}$; $\mathbf{S}_p^{(2)}$;
$\mathbf{S}_3, \mathbf{S}_4, \ldots, \mathbf{S}_{n-1}, \mathbf{S}_n, \mathbf{S}_p^{(1)}, \mathbf{S}_p^{(2)}$; $\mathbf{S}_p^{(3)}$; ...
until the minimum point is found. Here $\mathbf{S}_i$ indicates the coordinate direction $\mathbf{u}_i$ and
$\mathbf{S}_p^{(j)}$ the jth pattern direction. In Fig. 6.9, the previous base point





Figure 6.8 Progress of Powell’s method. 


is stored as the vector $\mathbf{Z}$ in block A, and the pattern direction is constructed by
subtracting the previous base point from the current one in block B. The pattern direction
is then used as a minimization direction in blocks C and D. For the next cycle, the
first direction used in the previous cycle is discarded in favor of the current pattern
direction. This is achieved by updating the numbers of the search directions as shown
in block E. Thus both points $\mathbf{Z}$ and $\mathbf{X}$ used in block B for the construction of the
pattern direction are points that are minima along $\mathbf{S}_n$ in the first cycle, the first pattern
direction $\mathbf{S}_p^{(1)}$ in the second cycle, the second pattern direction $\mathbf{S}_p^{(2)}$ in the third
cycle, and so on.

Figure 6.9 Flowchart for Powell's Method.

Quadratic Convergence. It can be seen from Fig. 6.9 that the pattern directions
$\mathbf{S}_p^{(1)}, \mathbf{S}_p^{(2)}, \mathbf{S}_p^{(3)}, \ldots$ are nothing but the lines joining the minima found
along the directions $\mathbf{S}_n, \mathbf{S}_p^{(1)}, \mathbf{S}_p^{(2)}, \ldots$, respectively. Hence by Theorem 6.1,
the pairs of directions $(\mathbf{S}_n, \mathbf{S}_p^{(1)})$, $(\mathbf{S}_p^{(1)}, \mathbf{S}_p^{(2)})$, and so on, are
$\mathbf{A}$-conjugate. Thus all the directions $\mathbf{S}_n, \mathbf{S}_p^{(1)}, \mathbf{S}_p^{(2)}, \ldots$ are
$\mathbf{A}$-conjugate. Since, by Theorem 6.2, any search method involving minimization along a set
of conjugate directions is quadratically convergent, Powell's method is quadratically
convergent. From the method used for constructing the conjugate directions
$\mathbf{S}_p^{(1)}, \mathbf{S}_p^{(2)}, \ldots$, we find that n minimization cycles are
required to complete the construction of n conjugate directions. In the ith cycle,
the minimization is done along the already constructed i conjugate directions and
the $n - i$ nonconjugate (coordinate) directions. Thus after n cycles, all the n search
directions are mutually conjugate and a quadratic will theoretically be minimized in
$n^2$ one-dimensional minimizations. This proves the quadratic convergence of Powell's
method.

It is to be noted that as with most of the numerical techniques, the convergence in 
many practical problems may not be as good as the theory seems to indicate. Powell's 
method may require a lot more iterations to minimize a function than the theoretically 
estimated number. There are several reasons for this: 

1. Since the number of cycles n is valid only for quadratic functions, it generally
takes more than n cycles for nonquadratic functions.

2. The proof of quadratic convergence has been established with the assumption
that the exact minimum is found in each of the one-dimensional minimizations.
However, the actual minimizing step lengths $\lambda_i^*$ will be only approximate, and
hence the subsequent directions will not be conjugate. Thus the method requires
more iterations to achieve overall convergence.

3. Powell's method, described above, can break down before the minimum point
is found. This is because the search directions $\mathbf{S}_i$ might become dependent or
almost dependent during numerical computation.

Convergence Criterion. The convergence criterion one would generally adopt in a 
method such as Powell’s method is to stop the procedure whenever a minimization 
cycle produces a change in all variables less than one-tenth of the required accuracy. 
However, a more elaborate convergence criterion, which is more likely to prevent 
premature termination of the process, was given by Powell [6.7]. 

Example 6.6 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$ from the starting
point $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$ using Powell's method.

SOLUTION 

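Cycle 1: Univariate Search

We minimize $f$ along $\mathbf{S}_2 = \mathbf{S}_n = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}$ from $\mathbf{X}_1$. To find
the correct direction ($+\mathbf{S}_2$ or $-\mathbf{S}_2$) for decreasing the value of $f$, we take the probe
length as $\varepsilon = 0.01$. As $f_1 = f(\mathbf{X}_1) = 0.0$, and

$$
f^+ = f(\mathbf{X}_1 + \varepsilon\mathbf{S}_2) = f(0.0, 0.01) = -0.0099 < f_1
$$

$f$ decreases along the direction $+\mathbf{S}_2$. To find the minimizing step length $\lambda^*$ along
$\mathbf{S}_2$, we minimize

$$
f(\mathbf{X}_1 + \lambda\mathbf{S}_2) = f(0.0, \lambda) = \lambda^2 - \lambda
$$

As $df/d\lambda = 0$ at $\lambda^* = \tfrac{1}{2}$, we have
$\mathbf{X}_2 = \mathbf{X}_1 + \lambda^*\mathbf{S}_2 = \begin{Bmatrix} 0 \\ 0.5 \end{Bmatrix}$.

Next we minimize $f$ along $\mathbf{S}_1 = \begin{Bmatrix} 1 \\ 0 \end{Bmatrix}$ from
$\mathbf{X}_2 = \begin{Bmatrix} 0.0 \\ 0.5 \end{Bmatrix}$. Since

$$
\begin{aligned}
f_2 &= f(\mathbf{X}_2) = f(0.0, 0.5) = -0.25 \\
f^+ &= f(\mathbf{X}_2 + \varepsilon\mathbf{S}_1) = f(0.01, 0.50) = -0.2298 > f_2 \\
f^- &= f(\mathbf{X}_2 - \varepsilon\mathbf{S}_1) = f(-0.01, 0.50) = -0.2698
\end{aligned}
$$

$f$ decreases along $-\mathbf{S}_1$. As $f(\mathbf{X}_2 - \lambda\mathbf{S}_1) = f(-\lambda, 0.50) = 2\lambda^2 - 2\lambda - 0.25$,
$df/d\lambda = 0$ at $\lambda^* = \tfrac{1}{2}$. Hence
$\mathbf{X}_3 = \mathbf{X}_2 - \lambda^*\mathbf{S}_1 = \begin{Bmatrix} -0.5 \\ 0.5 \end{Bmatrix}$.

Now we minimize $f$ along $\mathbf{S}_2 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}$ from
$\mathbf{X}_3 = \begin{Bmatrix} -0.5 \\ 0.5 \end{Bmatrix}$. As $f_3 = f(\mathbf{X}_3) = -0.75$ and
$f^+ = f(\mathbf{X}_3 + \varepsilon\mathbf{S}_2) = f(-0.5, 0.51) = -0.7599 < f_3$, $f$ decreases along the
$+\mathbf{S}_2$ direction. Since

$$
f(\mathbf{X}_3 + \lambda\mathbf{S}_2) = f(-0.5, 0.5 + \lambda) = \lambda^2 - \lambda - 0.75,
\qquad \frac{df}{d\lambda} = 0 \ \text{at}\ \lambda^* = \frac{1}{2}
$$

This gives

$$
\mathbf{X}_4 = \mathbf{X}_3 + \lambda^*\mathbf{S}_2 = \begin{Bmatrix} -0.5 \\ 1.0 \end{Bmatrix}
$$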

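Cycle 2: Pattern Search

Now we generate the first pattern direction as

$$
\mathbf{S}_p^{(1)} = \mathbf{X}_4 - \mathbf{X}_2
= \begin{Bmatrix} -0.5 \\ 1.0 \end{Bmatrix} - \begin{Bmatrix} 0.0 \\ 0.5 \end{Bmatrix}
= \begin{Bmatrix} -0.5 \\ 0.5 \end{Bmatrix}
$$

and minimize $f$ along $\mathbf{S}_p^{(1)}$ from $\mathbf{X}_4$. Since

$$
\begin{aligned}
f_4 &= f(\mathbf{X}_4) = -1.0 \\
f^+ &= f(\mathbf{X}_4 + \varepsilon\mathbf{S}_p^{(1)}) = f(-0.5 - 0.005,\ 1 + 0.005) \\
&= f(-0.505, 1.005) = -1.004975
\end{aligned}
$$

$f$ decreases in the positive direction of $\mathbf{S}_p^{(1)}$. As

$$
\begin{aligned}
f(\mathbf{X}_4 + \lambda\mathbf{S}_p^{(1)}) &= f(-0.5 - 0.5\lambda,\ 1.0 + 0.5\lambda) \\
&= 0.25\lambda^2 - 0.50\lambda - 1.00,
\end{aligned}
$$

$df/d\lambda = 0$ at $\lambda^* = 1.0$ and hence

$$
\mathbf{X}_5 = \mathbf{X}_4 + \lambda^*\mathbf{S}_p^{(1)} = \begin{Bmatrix} -1.0 \\ 1.5 \end{Bmatrix}
$$

The point $\mathbf{X}_5$ can be identified to be the optimum point.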
If we do not recognize $\mathbf{X}_5$ as the optimum point at this stage, we proceed to
minimize $f$ along the direction $\mathbf{S}_2 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}$ from $\mathbf{X}_5$.
Then we would obtain

$$
f_5 = f(\mathbf{X}_5) = -1.25, \quad f^+ = f(\mathbf{X}_5 + \varepsilon\mathbf{S}_2) > f_5,
\quad \text{and} \quad f^- = f(\mathbf{X}_5 - \varepsilon\mathbf{S}_2) > f_5
$$

This shows that $f$ cannot be minimized along $\mathbf{S}_2$, and hence $\mathbf{X}_5$ will be the optimum
point. In this example the convergence has been achieved in the second cycle itself.
This is to be expected in this case, as $f$ is a quadratic function, and the method is a
quadratically convergent method.
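
The cycle structure used in Example 6.6 can be sketched in Python as follows. This is a minimal illustration of the basic version described in the text, not Powell's full algorithm: the line search (a probe followed by bracketing and golden-section refinement) and the simple stopping test on the size of the pattern direction are assumptions made for the sketch, and the names `line_search` and `powell` are hypothetical.

```python
import math

def line_search(f, x, s, eps=1e-4, tol=1e-10):
    """Approximate minimizer of f along the line x + t*s, t of either sign."""
    phi = lambda t: f([xi + t * si for xi, si in zip(x, s)])
    f0 = phi(0.0)
    if phi(eps) < f0:                        # probe +s, then -s
        sign = 1.0
    elif phi(-eps) < f0:
        sign = -1.0
    else:
        return list(x)                       # no decrease detected along s
    g = lambda t: phi(sign * t)
    a, b, c = 0.0, eps, 2.0 * eps
    for _ in range(100):                     # bracket the minimum by doubling
        if g(c) >= g(b):
            break
        a, b, c = b, c, 2.0 * c
    r = (math.sqrt(5.0) - 1.0) / 2.0         # golden-section refinement on [a, c]
    lo, hi = a, c
    t1, t2 = hi - r * (hi - lo), lo + r * (hi - lo)
    g1, g2 = g(t1), g(t2)
    for _ in range(200):
        if hi - lo <= tol:
            break
        if g1 < g2:
            hi, t2, g2 = t2, t1, g1
            t1 = hi - r * (hi - lo)
            g1 = g(t1)
        else:
            lo, t1, g1 = t1, t2, g2
            t2 = lo + r * (hi - lo)
            g2 = g(t2)
    t = sign * 0.5 * (lo + hi)
    return [xi + t * si for xi, si in zip(x, s)]

def powell(f, x0, tol=1e-8, max_cycles=50):
    n = len(x0)
    dirs = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    x = line_search(f, [float(v) for v in x0], dirs[-1])   # first move along S_n
    for _ in range(max_cycles):
        z = list(x)                                        # store the base point Z
        for s in dirs:
            x = line_search(f, x, s)
        pattern = [xi - zi for xi, zi in zip(x, z)]        # pattern direction X - Z
        if max(abs(p) for p in pattern) < tol:
            break
        x = line_search(f, x, pattern)
        dirs = dirs[1:] + [pattern]                        # discard the oldest direction
    return x

f = lambda x: x[0] - x[1] + 2 * x[0] ** 2 + 2 * x[0] * x[1] + x[1] ** 2
print(powell(f, [0.0, 0.0]))
```

Run on the function of Example 6.6, the sketch follows the same sequence of moves as the hand calculation (along $\mathbf{S}_2$, then $\mathbf{S}_1$, $\mathbf{S}_2$, and the pattern direction), up to the accuracy of the numerical line search.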


6.7 SIMPLEX METHOD 

Definition: Simplex. The geometric figure formed by a set of n + 1 points in an 
n -dimensional space is called a simplex. When the points are equidistant, the simplex 
is said to be regular. Thus in two dimensions, the simplex is a triangle, and in three 
dimensions, it is a tetrahedron. 

The basic idea in the simplex method† is to compare the values of the objective
function at the $n + 1$ vertices of a general simplex and move the simplex gradually
toward the optimum point during the iterative process. The following equations
can be used to generate the vertices of a regular simplex (equilateral triangle in
two-dimensional space) of size $a$ in the n-dimensional space [6.10]:

$$
\mathbf{X}_i = \mathbf{X}_0 + p\,\mathbf{u}_i + \sum_{\substack{j=1 \\ j \neq i}}^{n} q\,\mathbf{u}_j, \quad i = 1, 2, \ldots, n
\tag{6.46}
$$

where

$$
p = \frac{a}{n\sqrt{2}}\left(\sqrt{n+1} + n - 1\right) \quad \text{and} \quad
q = \frac{a}{n\sqrt{2}}\left(\sqrt{n+1} - 1\right)
\tag{6.47}
$$

where $\mathbf{X}_0$ is the initial base point and $\mathbf{u}_j$ is the unit vector along the jth coordinate
axis. This method was originally given by Spendley, Hext, and Himsworth [6.10] and was
developed later by Nelder and Mead [6.11]. The movement of the simplex is achieved
by using three operations, known as reflection, contraction, and expansion.
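
Equations (6.46) and (6.47) translate directly into code. The sketch below builds the $n + 1$ vertices (the base point plus the $n$ points of Eq. (6.46)) and includes a helper for checking that all edges have length $a$; the function names are hypothetical.

```python
import math
from itertools import combinations

def regular_simplex(x0, a):
    """Vertices X_0, X_1, ..., X_n of a regular simplex of size a, Eqs. (6.46)-(6.47)."""
    n = len(x0)
    p = a / (n * math.sqrt(2.0)) * (math.sqrt(n + 1.0) + n - 1.0)
    q = a / (n * math.sqrt(2.0)) * (math.sqrt(n + 1.0) - 1.0)
    vertices = [list(x0)]
    for i in range(n):
        # coordinate i receives p, every other coordinate receives q
        vertices.append([x0[j] + (p if j == i else q) for j in range(n)])
    return vertices

def edge_lengths(vertices):
    """Distances between all pairs of vertices."""
    return [math.dist(u, v) for u, v in combinations(vertices, 2)]

tri = regular_simplex([0.0, 0.0], 1.0)
print(tri)
print(edge_lengths(tri))
```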

6.7.1 Reflection 

If $\mathbf{X}_h$ is the vertex corresponding to the highest value of the objective function among
the vertices of a simplex, we can expect the point $\mathbf{X}_r$ obtained by reflecting the point
$\mathbf{X}_h$ in the opposite face to have the smallest value. If this is the case, we can construct
a new simplex by rejecting the point $\mathbf{X}_h$ from the simplex and including the new point
$\mathbf{X}_r$. This process is illustrated in Fig. 6.10. In Fig. 6.10a, the points $\mathbf{X}_1$, $\mathbf{X}_2$, and
$\mathbf{X}_3$ form the original simplex, and the points $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_r$ form the new one.
Similarly, in Fig. 6.10b, the original simplex is given by points $\mathbf{X}_1$, $\mathbf{X}_2$, $\mathbf{X}_3$, and
$\mathbf{X}_4$, and the new one by $\mathbf{X}_1$, $\mathbf{X}_2$, $\mathbf{X}_3$, and $\mathbf{X}_r$. Again we can construct a new
simplex from the present one


†This simplex method should not be confused with the simplex method of linear programming.

Figure 6.10 Reflection.


by rejecting the vertex corresponding to the highest function value. Since the direction
of movement of the simplex is always away from the worst result, we will be moving
in a favorable direction. If the objective function does not have steep valleys, repetitive
application of the reflection process leads to a zigzag path in the general direction of
the minimum as shown in Fig. 6.11. Mathematically, the reflected point $\mathbf{X}_r$ is given
by

$$
\mathbf{X}_r = (1 + \alpha)\mathbf{X}_0 - \alpha\mathbf{X}_h
\tag{6.48}
$$

where $\mathbf{X}_h$ is the vertex corresponding to the maximum function value:

$$
f(\mathbf{X}_h) = \max_{i = 1 \text{ to } n+1} f(\mathbf{X}_i),
\tag{6.49}
$$



Figure 6.11 Progress of the reflection process. 
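$\mathbf{X}_0$ is the centroid of all the points $\mathbf{X}_i$ except $i = h$:

$$
\mathbf{X}_0 = \frac{1}{n}\sum_{\substack{i=1 \\ i \neq h}}^{n+1} \mathbf{X}_i
\tag{6.50}
$$

and $\alpha > 0$ is the reflection coefficient defined as

$$
\alpha = \frac{\text{distance between } \mathbf{X}_r \text{ and } \mathbf{X}_0}{\text{distance between } \mathbf{X}_h \text{ and } \mathbf{X}_0}
\tag{6.51}
$$

Thus $\mathbf{X}_r$ will lie on the line joining $\mathbf{X}_h$ and $\mathbf{X}_0$, on the far side of $\mathbf{X}_0$ from
$\mathbf{X}_h$ with $|\mathbf{X}_r - \mathbf{X}_0| = \alpha|\mathbf{X}_h - \mathbf{X}_0|$. If $f(\mathbf{X}_r)$ lies between
$f(\mathbf{X}_h)$ and $f(\mathbf{X}_l)$, where $\mathbf{X}_l$ is the vertex corresponding to the minimum function value,

$$
f(\mathbf{X}_l) = \min_{i = 1 \text{ to } n+1} f(\mathbf{X}_i)
\tag{6.52}
$$

$\mathbf{X}_h$ is replaced by $\mathbf{X}_r$ and a new simplex is started.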

If we use only the reflection process for finding the minimum, we may encounter
certain difficulties in some cases. For example, if one of the simplexes (triangles in
two dimensions) straddles a valley as shown in Fig. 6.12 and if the reflected point $\mathbf{X}_r$
happens to have an objective function value equal to that of the point $\mathbf{X}_h$, we will
enter into a closed cycle of operations. Thus if $\mathbf{X}_2$ is the worst point in the simplex
defined by the vertices $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_3$, the reflection process gives the new simplex
with vertices $\mathbf{X}_1$, $\mathbf{X}_3$, and $\mathbf{X}_r$. Again, since $\mathbf{X}_r$ has the highest function value out of
the vertices $\mathbf{X}_1$, $\mathbf{X}_3$, and $\mathbf{X}_r$, we obtain the old simplex itself by using the reflection
process. Thus the optimization process is stranded over the valley and there is no way
of moving toward the optimum point. This trouble can be overcome by making a rule
that no return can be made to points that have just been left.
Whenever such a situation is encountered, we reject the vertex corresponding to the
second worst value instead of the vertex corresponding to the worst function value.
This method, in general, leads the process to continue toward the region of the desired
minimum. However, the final simplex may again straddle the minimum, or it may lie
within a distance of the order of its own size from the minimum. In such cases it may
not be possible to obtain a new simplex with vertices closer to the minimum compared
to those of the previous simplex, and the pattern may lead to a cyclic process, as shown
in Fig. 6.13. In this example the successive simplexes formed from the simplex 123
are 234, 245, 456, 467, 478, 348, 234, 245, ...,† which can be seen to be forming
a cyclic process. Whenever this type of cycling is observed, one can take the vertex
that is occurring in every simplex (point 4 in Fig. 6.13) as the best approximation to
the optimum point. If more accuracy is desired, the simplex has to be contracted or
reduced in size, as indicated later.


6.7.2 Expansion 

If a reflection process gives a point $\mathbf{X}_r$ for which $f(\mathbf{X}_r) < f(\mathbf{X}_l)$ (i.e., if the
reflection produces a new minimum), one can generally expect to decrease the function value
further by moving along the direction pointing from $\mathbf{X}_0$ to $\mathbf{X}_r$. Hence we expand $\mathbf{X}_r$



Figure 6.13 Reflection process leading to a cyclic process. 


†Simplexes 456, 467, and 234 are formed by reflecting the second-worst point to avoid the difficulty
mentioned earlier.
to $\mathbf{X}_e$ using the relation

$$
\mathbf{X}_e = \gamma\mathbf{X}_r + (1 - \gamma)\mathbf{X}_0
\tag{6.53}
$$

where $\gamma$ is called the expansion coefficient, defined as

$$
\gamma = \frac{\text{distance between } \mathbf{X}_e \text{ and } \mathbf{X}_0}{\text{distance between } \mathbf{X}_r \text{ and } \mathbf{X}_0} > 1
$$

If $f(\mathbf{X}_e) < f(\mathbf{X}_l)$, we replace the point $\mathbf{X}_h$ by $\mathbf{X}_e$ and restart the process of
reflection. On the other hand, if $f(\mathbf{X}_e) > f(\mathbf{X}_l)$, it means that the expansion process is not
successful and hence we replace point $\mathbf{X}_h$ by $\mathbf{X}_r$ and start the reflection process again.


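6.7.3 Contraction

If the reflection process gives a point $\mathbf{X}_r$ for which $f(\mathbf{X}_r) > f(\mathbf{X}_i)$ for all $i$ except
$i = h$, and $f(\mathbf{X}_r) < f(\mathbf{X}_h)$, we replace point $\mathbf{X}_h$ by $\mathbf{X}_r$. Thus the new $\mathbf{X}_h$ will be
$\mathbf{X}_r$. In this case we contract the simplex as follows:

$$
\mathbf{X}_c = \beta\mathbf{X}_h + (1 - \beta)\mathbf{X}_0
\tag{6.54}
$$

where $\beta$ is called the contraction coefficient ($0 \le \beta \le 1$) and is defined as

$$
\beta = \frac{\text{distance between } \mathbf{X}_c \text{ and } \mathbf{X}_0}{\text{distance between } \mathbf{X}_h \text{ and } \mathbf{X}_0}
$$

If $f(\mathbf{X}_r) > f(\mathbf{X}_h)$, we still use Eq. (6.54) without changing the previous point $\mathbf{X}_h$. If
the contraction process produces a point $\mathbf{X}_c$ for which $f(\mathbf{X}_c) < \min[f(\mathbf{X}_h), f(\mathbf{X}_r)]$, we
replace the point $\mathbf{X}_h$ in $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_{n+1}$ by $\mathbf{X}_c$ and proceed with the reflection
process again. On the other hand, if $f(\mathbf{X}_c) \ge \min[f(\mathbf{X}_h), f(\mathbf{X}_r)]$, the contraction process
will be a failure, and in this case we replace all $\mathbf{X}_i$ by $(\mathbf{X}_i + \mathbf{X}_l)/2$ and restart the
reflection process.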

The method is assumed to have converged whenever the standard deviation of the
function at the $n + 1$ vertices of the current simplex is smaller than some prescribed
small quantity $\varepsilon$, that is,

$$
Q = \left\{ \sum_{i=1}^{n+1} \frac{[f(\mathbf{X}_i) - f(\mathbf{X}_0)]^2}{n + 1} \right\}^{1/2} \le \varepsilon
\tag{6.55}
$$


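Example 6.7 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$. Take the points
defining the initial simplex as

$$
\mathbf{X}_1 = \begin{Bmatrix} 4.0 \\ 4.0 \end{Bmatrix}, \quad
\mathbf{X}_2 = \begin{Bmatrix} 5.0 \\ 4.0 \end{Bmatrix}, \quad \text{and} \quad
\mathbf{X}_3 = \begin{Bmatrix} 4.0 \\ 5.0 \end{Bmatrix}
$$

and $\alpha = 1.0$, $\beta = 0.5$, and $\gamma = 2.0$. For convergence, take the value of $\varepsilon$ as 0.2.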
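SOLUTION
Iteration 1

Step 1: The function value at each of the vertices of the current simplex is given by

$$
\begin{aligned}
f_1 &= f(\mathbf{X}_1) = 4.0 - 4.0 + 2(16.0) + 2(16.0) + 16.0 = 80.0 \\
f_2 &= f(\mathbf{X}_2) = 5.0 - 4.0 + 2(25.0) + 2(20.0) + 16.0 = 107.0 \\
f_3 &= f(\mathbf{X}_3) = 4.0 - 5.0 + 2(16.0) + 2(20.0) + 25.0 = 96.0
\end{aligned}
$$

Therefore,

$$
\mathbf{X}_h = \mathbf{X}_2 = \begin{Bmatrix} 5.0 \\ 4.0 \end{Bmatrix}, \quad f(\mathbf{X}_h) = 107.0, \qquad
\mathbf{X}_l = \mathbf{X}_1 = \begin{Bmatrix} 4.0 \\ 4.0 \end{Bmatrix}, \quad f(\mathbf{X}_l) = 80.0
$$

Step 2: The centroid $\mathbf{X}_0$ is obtained as

$$
\mathbf{X}_0 = \frac{1}{2}(\mathbf{X}_1 + \mathbf{X}_3) = \frac{1}{2}\begin{Bmatrix} 4.0 + 4.0 \\ 4.0 + 5.0 \end{Bmatrix}
= \begin{Bmatrix} 4.0 \\ 4.5 \end{Bmatrix} \quad \text{with } f(\mathbf{X}_0) = 87.75
$$

Step 3: The reflection point is found as

$$
\mathbf{X}_r = 2\mathbf{X}_0 - \mathbf{X}_h = \begin{Bmatrix} 8.0 \\ 9.0 \end{Bmatrix} - \begin{Bmatrix} 5.0 \\ 4.0 \end{Bmatrix}
= \begin{Bmatrix} 3.0 \\ 5.0 \end{Bmatrix}
$$

Then

$$
f(\mathbf{X}_r) = 3.0 - 5.0 + 2(9.0) + 2(15.0) + 25.0 = 71.0
$$

Step 4: As $f(\mathbf{X}_r) < f(\mathbf{X}_l)$, we find $\mathbf{X}_e$ by expansion as

$$
\mathbf{X}_e = 2\mathbf{X}_r - \mathbf{X}_0 = \begin{Bmatrix} 6.0 \\ 10.0 \end{Bmatrix} - \begin{Bmatrix} 4.0 \\ 4.5 \end{Bmatrix}
= \begin{Bmatrix} 2.0 \\ 5.5 \end{Bmatrix}
$$

Then

$$
f(\mathbf{X}_e) = 2.0 - 5.5 + 2(4.0) + 2(11.0) + 30.25 = 56.75
$$

Step 5: Since $f(\mathbf{X}_e) < f(\mathbf{X}_l)$, we replace $\mathbf{X}_h$ by $\mathbf{X}_e$ and obtain the vertices of the new
simplex as

$$
\mathbf{X}_1 = \begin{Bmatrix} 4.0 \\ 4.0 \end{Bmatrix}, \quad
\mathbf{X}_2 = \begin{Bmatrix} 2.0 \\ 5.5 \end{Bmatrix}, \quad \text{and} \quad
\mathbf{X}_3 = \begin{Bmatrix} 4.0 \\ 5.0 \end{Bmatrix}
$$

Step 6: To test for convergence, we compute

$$
Q = \left[ \frac{(80.0 - 87.75)^2 + (56.75 - 87.75)^2 + (96.0 - 87.75)^2}{3} \right]^{1/2} = 19.06
$$

As this quantity is not smaller than $\varepsilon$, we go to the next iteration.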
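Iteration 2

Step 1: As $f(\mathbf{X}_1) = 80.0$, $f(\mathbf{X}_2) = 56.75$, and $f(\mathbf{X}_3) = 96.0$,

$$
\mathbf{X}_h = \mathbf{X}_3 = \begin{Bmatrix} 4.0 \\ 5.0 \end{Bmatrix} \quad \text{and} \quad
\mathbf{X}_l = \mathbf{X}_2 = \begin{Bmatrix} 2.0 \\ 5.5 \end{Bmatrix}
$$

Step 2: The centroid is

$$
\mathbf{X}_0 = \frac{1}{2}(\mathbf{X}_1 + \mathbf{X}_2) = \frac{1}{2}\begin{Bmatrix} 4.0 + 2.0 \\ 4.0 + 5.5 \end{Bmatrix}
= \begin{Bmatrix} 3.0 \\ 4.75 \end{Bmatrix}, \qquad f(\mathbf{X}_0) = 67.31
$$

Step 3:

$$
\mathbf{X}_r = 2\mathbf{X}_0 - \mathbf{X}_h = \begin{Bmatrix} 6.0 \\ 9.5 \end{Bmatrix} - \begin{Bmatrix} 4.0 \\ 5.0 \end{Bmatrix}
= \begin{Bmatrix} 2.0 \\ 4.5 \end{Bmatrix}
$$

$$
f(\mathbf{X}_r) = 2.0 - 4.5 + 2(4.0) + 2(9.0) + 20.25 = 43.75
$$

Step 4: As $f(\mathbf{X}_r) < f(\mathbf{X}_l)$, we find $\mathbf{X}_e$ as

$$
\mathbf{X}_e = 2\mathbf{X}_r - \mathbf{X}_0 = \begin{Bmatrix} 4.0 \\ 9.0 \end{Bmatrix} - \begin{Bmatrix} 3.0 \\ 4.75 \end{Bmatrix}
= \begin{Bmatrix} 1.0 \\ 4.25 \end{Bmatrix}
$$

$$
f(\mathbf{X}_e) = 1.0 - 4.25 + 2(1.0) + 2(4.25) + 18.0625 = 25.3125
$$

Step 5: As $f(\mathbf{X}_e) < f(\mathbf{X}_l)$, we replace $\mathbf{X}_h$ by $\mathbf{X}_e$ and obtain the new vertices as

$$
\mathbf{X}_1 = \begin{Bmatrix} 4.0 \\ 4.0 \end{Bmatrix}, \quad
\mathbf{X}_2 = \begin{Bmatrix} 2.0 \\ 5.5 \end{Bmatrix}, \quad \text{and} \quad
\mathbf{X}_3 = \begin{Bmatrix} 1.0 \\ 4.25 \end{Bmatrix}
$$

Step 6: For convergence, we compute $Q$ as

$$
Q = \left[ \frac{(80.0 - 67.31)^2 + (56.75 - 67.31)^2 + (25.3125 - 67.31)^2}{3} \right]^{1/2} = 26.1
$$

Since $Q > \varepsilon$, we go to the next iteration.

This procedure can be continued until the specified convergence is satisfied. When
the convergence is satisfied, the centroid $\mathbf{X}_0$ of the latest simplex can be taken as the
optimum point.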
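
One iteration of the procedure used in Example 6.7 (reflection, then expansion or contraction, then the convergence measure of Eq. (6.55)) can be sketched as below. This is a hedged illustration that follows the rules as stated in Sections 6.7.1 to 6.7.3; it omits the second-worst-vertex rule for avoiding cycling, and the function names are hypothetical. As in the example, $Q$ is evaluated with the centroid $\mathbf{X}_0$ computed before the worst vertex is replaced.

```python
import math

def f(x):
    return x[0] - x[1] + 2 * x[0] ** 2 + 2 * x[0] * x[1] + x[1] ** 2

def combine(a, x, b, y):
    """Return the point a*x + b*y."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

def simplex_iteration(f, verts, alpha=1.0, beta=0.5, gamma=2.0):
    """One iteration; returns (new_vertices, Q)."""
    verts = [list(v) for v in verts]
    fv = [f(v) for v in verts]
    m = len(verts)                                   # m = n + 1 vertices
    h = max(range(m), key=lambda i: fv[i])           # worst vertex X_h
    l = min(range(m), key=lambda i: fv[i])           # best vertex X_l
    x0 = [sum(v[j] for i, v in enumerate(verts) if i != h) / (m - 1)
          for j in range(m - 1)]                     # centroid, Eq. (6.50)
    xr = combine(1 + alpha, x0, -alpha, verts[h])    # reflection, Eq. (6.48)
    fr = f(xr)
    if fr < fv[l]:                                   # new minimum: try expansion
        xe = combine(gamma, xr, 1 - gamma, x0)       # Eq. (6.53)
        verts[h] = xe if f(xe) < fv[l] else xr
    elif all(fr > fv[i] for i in range(m) if i != h):
        if fr < fv[h]:                               # X_r becomes the new X_h
            verts[h], fv[h] = xr, fr
        xc = combine(beta, verts[h], 1 - beta, x0)   # contraction, Eq. (6.54)
        if f(xc) < fv[h]:
            verts[h] = xc
        else:                                        # failure: shrink toward X_l
            verts = [combine(0.5, v, 0.5, verts[l]) for v in verts]
    else:
        verts[h] = xr
    f0 = f(x0)
    q = math.sqrt(sum((f(v) - f0) ** 2 for v in verts) / m)   # Eq. (6.55)
    return verts, q

def simplex_minimize(f, verts, eps=0.2, max_iter=200):
    for _ in range(max_iter):
        verts, q = simplex_iteration(f, verts)
        if q <= eps:
            break
    return verts

v1, q1 = simplex_iteration(f, [[4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
v2, q2 = simplex_iteration(f, v1)
print(v1, q1)
print(v2, q2)
```

The first two calls reproduce the vertices obtained in Iterations 1 and 2 of the example, with $Q$ values that agree with the hand calculation to the rounding used there.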
Indirect Search (Descent) Methods


6.8 GRADIENT OF A FUNCTION 

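The gradient of a function is an n-component vector given by

$$
\nabla f_{\,n \times 1} =
\begin{Bmatrix}
\partial f/\partial x_1 \\
\partial f/\partial x_2 \\
\vdots \\
\partial f/\partial x_n
\end{Bmatrix}
\tag{6.56}
$$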


The gradient has a very important property. If we move along the gradient direction 
from any point in n -dimensional space, the function value increases at the fastest rate. 
Hence the gradient direction is called the direction of steepest ascent . Unfortunately, the 
direction of steepest ascent is a local property and not a global one. This is illustrated 
in Fig. 6.14, where the gradient vectors $\nabla f$ evaluated at points 1, 2, 3, and 4 lie along
the directions 11', 22', 33', and 44', respectively. Thus the function value increases at
the fastest rate in the direction 11' at point 1, but not at point 2. Similarly, the function
value increases at the fastest rate in direction 22' (33') at point 2 (3), but not at point
3 (4). In other words, the direction of steepest ascent generally varies from point to
point, and if we make infinitely small moves along the direction of steepest ascent, the
path will be a curved line like the curve 1-2-3-4 in Fig. 6.14.





Figure 6.14 Steepest ascent directions. 
Since the gradient vector represents the direction of steepest ascent, the negative
of the gradient vector denotes the direction of steepest descent. Thus any method that 
makes use of the gradient vector can be expected to give the minimum point faster 
than one that does not make use of the gradient vector. All the descent methods make 
use of the gradient vector, either directly or indirectly, in finding the search directions. 
Before considering the descent methods of minimization, we prove that the gradient 
vector represents the direction of steepest ascent. 

Theorem 6.3 The gradient vector represents the direction of steepest ascent.

Proof: Consider an arbitrary point $\mathbf{X}$ in the n-dimensional space. Let $f$ denote the value
of the objective function at the point $\mathbf{X}$. Consider a neighboring point $\mathbf{X} + d\mathbf{X}$ with

$$
d\mathbf{X} =
\begin{Bmatrix}
dx_1 \\ dx_2 \\ \vdots \\ dx_n
\end{Bmatrix}
\tag{6.57}
$$

where $dx_1, dx_2, \ldots, dx_n$ represent the components of the vector $d\mathbf{X}$. The magnitude
of the vector $d\mathbf{X}$, $ds$, is given by

$$
d\mathbf{X}^T d\mathbf{X} = (ds)^2 = \sum_{i=1}^{n} (dx_i)^2
\tag{6.58}
$$

If $f + df$ denotes the value of the objective function at $\mathbf{X} + d\mathbf{X}$, the change in $f$, $df$,
associated with $d\mathbf{X}$ can be expressed as

$$
df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i = \nabla f^T d\mathbf{X}
\tag{6.59}
$$

If $\mathbf{u}$ denotes the unit vector along the direction $d\mathbf{X}$ and $ds$ the length of $d\mathbf{X}$, we can
write

$$
d\mathbf{X} = \mathbf{u}\,ds
\tag{6.60}
$$

The rate of change of the function with respect to the step length $ds$ is given by
Eq. (6.59) as

$$
\frac{df}{ds} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\frac{dx_i}{ds} = \nabla f^T \frac{d\mathbf{X}}{ds} = \nabla f^T \mathbf{u}
\tag{6.61}
$$


The value of $df/ds$ will be different for different directions and we are interested in 
finding the particular step $d\mathbf{X}$ along which the value of $df/ds$ will be maximum. This 
will give the direction of steepest ascent. (In general, if $df/ds = \nabla f^T \mathbf{u} > 0$ along a 
vector $d\mathbf{X}$, it is called a direction of ascent, and if $df/ds < 0$, it is called a direction 
of descent.) By using the definition of the dot product, Eq. (6.61) can be rewritten as 

$$ \frac{df}{ds} = \|\nabla f\| \, \|\mathbf{u}\| \cos\theta \tag{6.62} $$

where $\|\nabla f\|$ and $\|\mathbf{u}\|$ denote the lengths of the vectors $\nabla f$ and $\mathbf{u}$, respectively, and $\theta$ 
indicates the angle between the vectors $\nabla f$ and $\mathbf{u}$. It can be seen that $df/ds$ will be 
maximum when $\theta = 0°$ and minimum when $\theta = 180°$. This indicates that the function 
value increases at a maximum rate in the direction of the gradient (i.e., when $\mathbf{u}$ is 
along $\nabla f$). 
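As a small numerical illustration of Eq. (6.62), the sketch below scans unit vectors $\mathbf{u} = (\cos\theta, \sin\theta)$ around a point and reports the direction with the largest $\nabla f^T \mathbf{u}$. The test function (the quadratic used in the examples of this chapter), the evaluation point (1, 2), and the names `grad_f` and `rate_of_change` are assumptions chosen for illustration only.

```python
import math

def grad_f(x1, x2):
    # Gradient of f = x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2 (illustrative choice)
    return (1 + 4 * x1 + 2 * x2, -1 + 2 * x1 + 2 * x2)

def rate_of_change(g, theta):
    # df/ds = grad(f)^T u for the unit vector u = (cos(theta), sin(theta)), Eq. (6.61)
    return g[0] * math.cos(theta) + g[1] * math.sin(theta)

g = grad_f(1.0, 2.0)                      # gradient at the sample point (1, 2)
grad_norm = math.hypot(g[0], g[1])        # ||grad f||
grad_angle = math.atan2(g[1], g[0])       # orientation of the gradient vector

# Scan 3600 equally spaced directions and keep the one with the largest df/ds
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(angles, key=lambda t: rate_of_change(g, t))
best_rate = rate_of_change(g, best_angle)

print("best direction angle:", best_angle, "gradient angle:", grad_angle)
print("best rate:", best_rate, "gradient norm:", grad_norm)
```

Up to the resolution of the angular grid, the best direction coincides with the gradient direction and the best rate approaches $\|\nabla f\|$, in line with the statement of Theorem 6.4 that follows.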


Theorem 6.4 The maximum rate of change of $f$ at any point $\mathbf{X}$ is equal to the 
magnitude of the gradient vector at the same point. 

Proof: The rate of change of the function $f$ with respect to the step length $s$ along a 
direction $\mathbf{u}$ is given by Eq. (6.62). Since $df/ds$ is maximum when $\theta = 0°$ and $\mathbf{u}$ is a 
unit vector, Eq. (6.62) gives 

$$ \left. \frac{df}{ds} \right|_{\max} = \|\nabla f\| $$

which proves the theorem. 


6.8.1 Evaluation of the Gradient 

The evaluation of the gradient requires the computation of the partial derivatives $\partial f/\partial x_i$, 
$i = 1, 2, \ldots, n$. There are three situations where the evaluation of the gradient poses 
certain problems: 

1. The function is differentiable at all the points, but the calculation of the 
components of the gradient, $\partial f/\partial x_i$, is either impractical or impossible. 

2. The expressions for the partial derivatives $\partial f/\partial x_i$ can be derived, but they 
require large computational time for evaluation. 

3. The gradient $\nabla f$ is not defined at all the points. 

In the first case, we can use the forward finite-difference formula 

$$ \left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{X}_m} \simeq \frac{f(\mathbf{X}_m + \Delta x_i \mathbf{u}_i) - f(\mathbf{X}_m)}{\Delta x_i}, \qquad i = 1, 2, \ldots, n \tag{6.63} $$

to approximate the partial derivative $\partial f/\partial x_i$ at $\mathbf{X}_m$. If the function value at the base 
point $\mathbf{X}_m$ is known, this formula requires one additional function evaluation to find 
$(\partial f/\partial x_i)|_{\mathbf{X}_m}$. Thus it requires $n$ additional function evaluations to evaluate the 
approximate gradient $\nabla f|_{\mathbf{X}_m}$. For better results we can use the central finite-difference 
formula to find the approximate partial derivative $\partial f/\partial x_i|_{\mathbf{X}_m}$: 

$$ \left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{X}_m} \simeq \frac{f(\mathbf{X}_m + \Delta x_i \mathbf{u}_i) - f(\mathbf{X}_m - \Delta x_i \mathbf{u}_i)}{2 \Delta x_i}, \qquad i = 1, 2, \ldots, n \tag{6.64} $$

This formula requires two additional function evaluations for each of the partial 
derivatives. In Eqs. (6.63) and (6.64), $\Delta x_i$ is a small scalar quantity and $\mathbf{u}_i$ is a vector of 
order $n$ whose $i$th component has a value of 1, and all other components have a value of zero. 
In practical computations, the value of $\Delta x_i$ has to be chosen with some care. If $\Delta x_i$ is 
too small, the difference between the values of the function evaluated at $(\mathbf{X}_m + \Delta x_i \mathbf{u}_i)$ 
and $(\mathbf{X}_m - \Delta x_i \mathbf{u}_i)$ may be very small and numerical round-off error may predominate. 
On the other hand, if $\Delta x_i$ is too large, the truncation error may predominate in the 
calculation of the gradient. 
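The two formulas, Eqs. (6.63) and (6.64), can be sketched as follows. This is a minimal illustration, not a production routine: the function names, the default step `h = 1e-6` (playing the role of $\Delta x_i$), and the quadratic test function are assumptions made here for demonstration.

```python
def forward_difference_gradient(f, x, h=1e-6):
    # Eq. (6.63): one extra function evaluation per component (n in total)
    fx = f(x)
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h                      # X_m + dx_i * u_i
        grad.append((f(xp) - fx) / h)
    return grad

def central_difference_gradient(f, x, h=1e-6):
    # Eq. (6.64): two extra function evaluations per component (2n in total)
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h                      # X_m + dx_i * u_i
        xm[i] -= h                      # X_m - dx_i * u_i
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

def f(x):
    # Illustrative test function; its exact gradient at (1, 2) is (9, 5)
    return x[0] - x[1] + 2 * x[0] ** 2 + 2 * x[0] * x[1] + x[1] ** 2

fwd = forward_difference_gradient(f, [1.0, 2.0])
cen = central_difference_gradient(f, [1.0, 2.0])
print("forward:", fwd)
print("central:", cen)
```

Rerunning the sketch with a much smaller or much larger `h` gives a feel for the round-off and truncation effects described above.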

In the second case also, the use of finite-difference formulas is preferred whenever 
the exact gradient evaluation requires more computational time than the one involved 
in using Eq. (6.63) or (6.64). 

In the third case, we cannot use the finite-difference formulas since the gradient 
is not defined at all the points. For example, consider the function shown in Fig. 6.15. 
If Eq. (6.64) is used to evaluate the derivative $df/ds$ at $\mathbf{X}_m$, we obtain a value of 
$\alpha_1$ for a step size $\Delta x_1$ and a value of $\alpha_2$ for a step size $\Delta x_2$. Since, in reality, the 
derivative does not exist at the point $\mathbf{X}_m$, use of finite-difference formulas might lead 
to a complete breakdown of the minimization process. In such cases the minimization 
can be done only by one of the direct search techniques discussed earlier. 


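Figure 6.15 Gradient not defined at $x_m$. 


6.8.2 Rate of Change of a Function along a Direction 

In most optimization techniques, we are interested in finding the rate of change of a 
function with respect to a parameter $\lambda$ along a specified direction, $\mathbf{S}_i$, away from a 
point $\mathbf{X}_i$. Any point in the specified direction away from the given point $\mathbf{X}_i$ can be 
expressed as $\mathbf{X} = \mathbf{X}_i + \lambda \mathbf{S}_i$. Our interest is to find the rate of change of the function 
along the direction $\mathbf{S}_i$ (characterized by the parameter $\lambda$), that is, 

$$ \frac{df}{d\lambda} = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j} \frac{\partial x_j}{\partial \lambda} \tag{6.65} $$

where $x_j$ is the $j$th component of $\mathbf{X}$. But 

$$ \frac{\partial x_j}{\partial \lambda} = \frac{\partial}{\partial \lambda} (x_{ij} + \lambda s_{ij}) = s_{ij} \tag{6.66} $$

where $x_{ij}$ and $s_{ij}$ are the $j$th components of $\mathbf{X}_i$ and $\mathbf{S}_i$, respectively. Hence 

$$ \frac{df}{d\lambda} = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\, s_{ij} = \nabla f^T \mathbf{S}_i \tag{6.67} $$

If $\lambda^*$ minimizes $f$ in the direction $\mathbf{S}_i$, we have 

$$ \left. \frac{df}{d\lambda} \right|_{\lambda = \lambda^*} = \nabla f \big|_{\lambda^*}^{T}\, \mathbf{S}_i = 0 \tag{6.68} $$

at the point $\mathbf{X}_i + \lambda^* \mathbf{S}_i$. 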
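A short numerical check of the relation $df/d\lambda = \nabla f^T \mathbf{S}_i$ and of the stationarity condition at $\lambda^*$ is sketched below. The test function, the starting point (0, 0), the direction (-1, 1), and the helper name `rate_along` are illustrative assumptions; for this function the restriction along the chosen line is $\lambda^2 - 2\lambda$, so $\lambda^* = 1$.

```python
def grad_f(x):
    # Gradient of the illustrative function f = x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2
    return [1 + 4 * x[0] + 2 * x[1], -1 + 2 * x[0] + 2 * x[1]]

def rate_along(x, s, lam):
    # df/dlambda = grad f(X + lam*S)^T S
    point = [xi + lam * si for xi, si in zip(x, s)]
    g = grad_f(point)
    return sum(gj * sj for gj, sj in zip(g, s))

x_i = [0.0, 0.0]
s_i = [-1.0, 1.0]
for lam in (0.0, 0.5, 1.0, 1.5):
    # Along this line f = lam**2 - 2*lam, so the slope is 2*lam - 2
    print(lam, rate_along(x_i, s_i, lam))
```

The slope changes sign at $\lambda = 1$, where the gradient at the new point is orthogonal to the search direction.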


6.9 STEEPEST DESCENT (CAUCHY) METHOD 


The use of the negative of the gradient vector as a direction for minimization was 
first made by Cauchy in 1847 [6.12]. In this method we start from an initial trial 
point $\mathbf{X}_1$ and iteratively move along the steepest descent directions until the optimum 
point is found. The steepest descent method can be summarized by the following 
steps: 

1. Start with an arbitrary initial point $\mathbf{X}_1$. Set the iteration number as $i = 1$. 

2. Find the search direction $\mathbf{S}_i$ as 

$$ \mathbf{S}_i = -\nabla f_i = -\nabla f(\mathbf{X}_i) \tag{6.69} $$

3. Determine the optimal step length $\lambda_i^*$ in the direction $\mathbf{S}_i$ and set 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i = \mathbf{X}_i - \lambda_i^* \nabla f_i \tag{6.70} $$

4. Test the new point, $\mathbf{X}_{i+1}$, for optimality. If $\mathbf{X}_{i+1}$ is optimum, stop the process. 
Otherwise, go to step 5. 

5. Set the new iteration number $i = i + 1$ and go to step 2. 

The method of steepest descent may appear to be the best unconstrained minimization 
technique since each one-dimensional search starts in the "best" direction. However, 
owing to the fact that the steepest descent direction is a local property, the method is 
not really effective in most problems. 

Example 6.8 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$ starting from the 
point $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$. 

SOLUTION 

Iteration 1 


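The gradient of $f$ is given by 

$$ \nabla f = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix} = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix} $$

$$ \nabla f_1 = \nabla f(\mathbf{X}_1) = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} $$

Therefore, 

$$ \mathbf{S}_1 = -\nabla f_1 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} $$

To find $\mathbf{X}_2$, we need to find the optimal step length $\lambda_1^*$. For this, we minimize 
$f(\mathbf{X}_1 + \lambda_1 \mathbf{S}_1) = f(-\lambda_1, \lambda_1) = \lambda_1^2 - 2\lambda_1$ with respect to $\lambda_1$. Since $df/d\lambda_1 = 0$ 
at $\lambda_1^* = 1$, we obtain 

$$ \mathbf{X}_2 = \mathbf{X}_1 + \lambda_1^* \mathbf{S}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + 1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} $$

As $\nabla f_2 = \nabla f(\mathbf{X}_2) = \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} \neq \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$, $\mathbf{X}_2$ is not optimum. 

Iteration 2 

$$ \mathbf{S}_2 = -\nabla f_2 = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix} $$

To minimize 

$$ f(\mathbf{X}_2 + \lambda_2 \mathbf{S}_2) = f(-1 + \lambda_2,\ 1 + \lambda_2) = 5\lambda_2^2 - 2\lambda_2 - 1 $$

we set $df/d\lambda_2 = 0$. This gives $\lambda_2^* = \tfrac{1}{5}$, and hence 

$$ \mathbf{X}_3 = \mathbf{X}_2 + \lambda_2^* \mathbf{S}_2 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} + \frac{1}{5} \begin{Bmatrix} 1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -0.8 \\ 1.2 \end{Bmatrix} $$

Since the components of the gradient at $\mathbf{X}_3$, $\nabla f_3 = \begin{Bmatrix} 0.2 \\ -0.2 \end{Bmatrix}$, are not zero, we proceed 
to the next iteration. 

Iteration 3 

$$ \mathbf{S}_3 = -\nabla f_3 = \begin{Bmatrix} -0.2 \\ 0.2 \end{Bmatrix} $$

As 

$$ f(\mathbf{X}_3 + \lambda_3 \mathbf{S}_3) = f(-0.8 - 0.2\lambda_3,\ 1.2 + 0.2\lambda_3) = 0.04\lambda_3^2 - 0.08\lambda_3 - 1.20, \qquad \frac{df}{d\lambda_3} = 0 \ \text{at} \ \lambda_3^* = 1.0 $$

Therefore, 

$$ \mathbf{X}_4 = \mathbf{X}_3 + \lambda_3^* \mathbf{S}_3 = \begin{Bmatrix} -0.8 \\ 1.2 \end{Bmatrix} + 1.0 \begin{Bmatrix} -0.2 \\ 0.2 \end{Bmatrix} = \begin{Bmatrix} -1.0 \\ 1.4 \end{Bmatrix} $$

The gradient at $\mathbf{X}_4$ is given by 

$$ \nabla f_4 = \begin{Bmatrix} -0.20 \\ -0.20 \end{Bmatrix} $$

Since $\nabla f_4 \neq \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$, $\mathbf{X}_4$ is not optimum and hence we have to proceed to the next iteration. 
This process has to be continued until the optimum point, $\mathbf{X}^* = \begin{Bmatrix} -1.0 \\ 1.5 \end{Bmatrix}$, is found. 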
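The iterations of Example 6.8 can be reproduced with a short sketch of the steepest descent steps. It assumes the objective is written as $f = \tfrac{1}{2}\mathbf{X}^T[A]\mathbf{X} + \mathbf{B}^T\mathbf{X}$ with $[A] = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}$ and $\mathbf{B} = \{1, -1\}^T$, so that the optimal step has the closed form $\lambda^* = -\mathbf{S}^T \nabla f / (\mathbf{S}^T [A] \mathbf{S})$; for a general function a one-dimensional search would be needed instead. The function names, tolerance, and iteration cap are illustrative choices.

```python
A = [[4.0, 2.0], [2.0, 2.0]]   # Hessian of the quadratic in Example 6.8
b = [1.0, -1.0]                # linear term

def grad(x):
    return [A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1]]

def exact_step(g, s):
    # Closed-form minimizing step along s; valid only for a quadratic objective
    As = [A[0][0] * s[0] + A[0][1] * s[1], A[1][0] * s[0] + A[1][1] * s[1]]
    return -(s[0] * g[0] + s[1] * g[1]) / (s[0] * As[0] + s[1] * As[1])

def steepest_descent(x, eps=1e-6, max_iter=200):
    history = [list(x)]
    for _ in range(max_iter):
        g = grad(x)
        if max(abs(g[0]), abs(g[1])) <= eps:    # optimality test on the gradient
            break
        s = [-g[0], -g[1]]                      # Eq. (6.69)
        lam = exact_step(g, s)
        x = [x[0] + lam * s[0], x[1] + lam * s[1]]   # Eq. (6.70)
        history.append(list(x))
    return x, history

x_opt, history = steepest_descent([0.0, 0.0])
print(history[:4])                 # X1 .. X4 of the worked example
print(x_opt, len(history) - 1)     # approximate optimum and iteration count
```

Even on this two-variable quadratic the method needs many iterations to approach (-1.0, 1.5), which illustrates the zigzagging behavior that motivates the conjugate gradient method.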


Convergence Criteria: The following criteria can be used to terminate the iterative 
process. 

1. When the change in function value in two consecutive iterations is small: 

$$ \left| \frac{f(\mathbf{X}_{i+1}) - f(\mathbf{X}_i)}{f(\mathbf{X}_i)} \right| \le \varepsilon_1 \tag{6.71} $$

2. When the partial derivatives (components of the gradient) of $f$ are small: 

$$ \left| \frac{\partial f}{\partial x_i} \right| \le \varepsilon_2, \qquad i = 1, 2, \ldots, n \tag{6.72} $$

3. When the change in the design vector in two consecutive iterations is small: 

$$ |\mathbf{X}_{i+1} - \mathbf{X}_i| \le \varepsilon_3 \tag{6.73} $$
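The three tests can be collected in one helper, sketched below. The tolerance defaults and the function name are assumptions, and the fallback to an absolute change when $f(\mathbf{X}_i) = 0$ is a practical guard added here; it is not part of Eq. (6.71).

```python
import math

def convergence_checks(f_old, f_new, grad_new, x_old, x_new,
                       eps1=1e-8, eps2=1e-6, eps3=1e-8):
    # Criterion 1, Eq. (6.71): relative change in the function value.
    # Assumption: use the absolute change when f_old is exactly zero.
    change = abs(f_new - f_old)
    rel_change = change / abs(f_old) if f_old != 0 else change
    crit1 = rel_change <= eps1
    # Criterion 2, Eq. (6.72): every gradient component is small
    crit2 = all(abs(g) <= eps2 for g in grad_new)
    # Criterion 3, Eq. (6.73): length of the change in the design vector
    step = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_new, x_old)))
    crit3 = step <= eps3
    return crit1, crit2, crit3

# Values from iteration 3 of Example 6.8: X3 -> X4, f3 = -1.20, f4 = -1.24
print(convergence_checks(-1.20, -1.24, [-0.2, -0.2], [-0.8, 1.2], [-1.0, 1.4]))
```

In practice the criteria are often combined, since any single test can be satisfied prematurely on flat or poorly scaled functions.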


6.10 CONJUGATE GRADIENT (FLETCHER-REEVES) METHOD 

The convergence characteristics of the steepest descent method can be improved greatly 
by modifying it into a conjugate gradient method (which can be considered as a con- 
jugate directions method involving the use of the gradient of the function). We saw 
(in Section 6.6.) that any minimization method that makes use of the conjugate direc- 
tions is quadratically convergent. This property of quadratic convergence is very useful 
because it ensures that the method will minimize a quadratic function in n steps or 
less. Since any general function can be approximated reasonably well by a quadratic 
near the optimum point, any quadratically convergent method is expected to find the 
optimum point in a finite number of iterations. 

We have seen that Powell's conjugate direction method requires $n$ single-variable 
minimizations per iteration and sets up a new conjugate direction at the end of each 
iteration. Thus it requires, in general, $n^2$ single-variable minimizations to find the 
minimum of a quadratic function. On the other hand, if we can evaluate the gradients of the 
objective function, we can set up a new conjugate direction after every one-dimensional 
minimization, and hence we can achieve faster convergence. The construction of con- 
jugate directions and development of the Fletcher-Reeves method are discussed in this 
section. 


6.10.1 Development of the Fletcher-Reeves Method 

The Fletcher-Reeves method is developed by modifying the steepest descent method 
to make it quadratically convergent. Starting from an arbitrary point $\mathbf{X}_1$, the quadratic 
function 

$$ f(\mathbf{X}) = \tfrac{1}{2} \mathbf{X}^T [A] \mathbf{X} + \mathbf{B}^T \mathbf{X} + C \tag{6.74} $$

can be minimized by searching along the search direction $\mathbf{S}_1 = -\nabla f_1$ (steepest descent 
direction) using the step length (see Problem 6.40): 

$$ \lambda_1^* = -\frac{\mathbf{S}_1^T \nabla f_1}{\mathbf{S}_1^T [A] \mathbf{S}_1} \tag{6.75} $$

The second search direction $\mathbf{S}_2$ is found as a linear combination of $\mathbf{S}_1$ and $-\nabla f_2$: 

$$ \mathbf{S}_2 = -\nabla f_2 + \beta_2 \mathbf{S}_1 \tag{6.76} $$

where the constant $\beta_2$ can be determined by making $\mathbf{S}_1$ and $\mathbf{S}_2$ conjugate with respect 
to $[A]$. This leads to (see Problem 6.41): 

$$ \beta_2 = -\frac{\nabla f_2^T \nabla f_2}{\nabla f_1^T \mathbf{S}_1} = \frac{\nabla f_2^T \nabla f_2}{\nabla f_1^T \nabla f_1} \tag{6.77} $$

This process can be continued to obtain the general formula for the $i$th search 
direction as 

$$ \mathbf{S}_i = -\nabla f_i + \beta_i \mathbf{S}_{i-1} \tag{6.78} $$

where 

$$ \beta_i = \frac{\nabla f_i^T \nabla f_i}{\nabla f_{i-1}^T \nabla f_{i-1}} \tag{6.79} $$

Thus the Fletcher-Reeves algorithm can be stated as follows. 

6.10.2 Fletcher-Reeves Method 


The iterative procedure of the Fletcher-Reeves method can be stated as follows: 

1. Start with an arbitrary initial point $\mathbf{X}_1$. 

2. Set the first search direction $\mathbf{S}_1 = -\nabla f(\mathbf{X}_1) = -\nabla f_1$. 

3. Find the point $\mathbf{X}_2$ according to the relation 

$$ \mathbf{X}_2 = \mathbf{X}_1 + \lambda_1^* \mathbf{S}_1 \tag{6.80} $$

where $\lambda_1^*$ is the optimal step length in the direction $\mathbf{S}_1$. Set $i = 2$ and go to the 
next step. 

4. Find $\nabla f_i = \nabla f(\mathbf{X}_i)$, and set 

$$ \mathbf{S}_i = -\nabla f_i + \frac{|\nabla f_i|^2}{|\nabla f_{i-1}|^2}\, \mathbf{S}_{i-1} \tag{6.81} $$

5. Compute the optimum step length $\lambda_i^*$ in the direction $\mathbf{S}_i$, and find the new point 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i \tag{6.82} $$

6. Test for the optimality of the point $\mathbf{X}_{i+1}$. If $\mathbf{X}_{i+1}$ is optimum, stop the process. 
Otherwise, set the value of $i = i + 1$ and go to step 4. 

Remarks: 

1. The Fletcher-Reeves method was originally proposed by Hestenes and Stiefel 
[6.14] as a method for solving systems of linear equations derived from the 
stationary conditions of a quadratic. Since the directions $\mathbf{S}_i$ used in this method 
are $A$-conjugate, the process should converge in $n$ cycles or less for a quadratic 
function. However, for ill-conditioned quadratics (whose contours are highly 
eccentric and distorted), the method may require much more than $n$ cycles for 
convergence. The reason for this has been found to be the cumulative effect 
of rounding errors. Since $\mathbf{S}_i$ is given by Eq. (6.81), any error resulting from 
the inaccuracies involved in the determination of $\lambda_i^*$, and from the round-off 
error involved in accumulating the successive $|\nabla f_i|^2 \mathbf{S}_{i-1}/|\nabla f_{i-1}|^2$ terms, is 
carried forward through the vector $\mathbf{S}_i$. Thus the search directions $\mathbf{S}_i$ will be 
progressively contaminated by these errors. Hence it is necessary, in practice, 
to restart the method periodically after every, say, $m$ steps by taking the new 
search direction as the steepest descent direction. That is, after every $m$ steps, 
$\mathbf{S}_{m+1}$ is set equal to $-\nabla f_{m+1}$ instead of the usual form. Fletcher and Reeves 
have recommended a value of $m = n + 1$, where $n$ is the number of design 
variables. 

2. Despite the limitations indicated above, the Fletcher-Reeves method is vastly 
superior to the steepest descent method and the pattern search methods, but 
it turns out to be rather less efficient than the Newton and the quasi-Newton 
(variable metric) methods discussed in the later sections. 

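Example 6.9 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$ starting from the 
point $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$. 

SOLUTION 

Iteration 1 

$$ \nabla f = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix} = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix} $$

$$ \nabla f_1 = \nabla f(\mathbf{X}_1) = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} $$

The search direction is taken as $\mathbf{S}_1 = -\nabla f_1 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$. To find the optimal step length 
$\lambda_1^*$ along $\mathbf{S}_1$, we minimize $f(\mathbf{X}_1 + \lambda_1 \mathbf{S}_1)$ with respect to $\lambda_1$. Here 

$$ f(\mathbf{X}_1 + \lambda_1 \mathbf{S}_1) = f(-\lambda_1, +\lambda_1) = \lambda_1^2 - 2\lambda_1 $$

$$ \frac{df}{d\lambda_1} = 0 \ \text{at} \ \lambda_1^* = 1 $$

Therefore, 

$$ \mathbf{X}_2 = \mathbf{X}_1 + \lambda_1^* \mathbf{S}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + 1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} $$

Iteration 2 

Since $\nabla f_2 = \nabla f(\mathbf{X}_2) = \begin{Bmatrix} -1 \\ -1 \end{Bmatrix}$, Eq. (6.81) gives the next search direction as 

$$ \mathbf{S}_2 = -\nabla f_2 + \frac{|\nabla f_2|^2}{|\nabla f_1|^2}\, \mathbf{S}_1 $$

where 

$$ |\nabla f_1|^2 = 2 \quad \text{and} \quad |\nabla f_2|^2 = 2 $$

Therefore, 

$$ \mathbf{S}_2 = -\begin{Bmatrix} -1 \\ -1 \end{Bmatrix} + \frac{2}{2} \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} 0 \\ +2 \end{Bmatrix} $$

To find $\lambda_2^*$, we minimize 

$$ f(\mathbf{X}_2 + \lambda_2 \mathbf{S}_2) = f(-1,\ 1 + 2\lambda_2) = -1 - (1 + 2\lambda_2) + 2 - 2(1 + 2\lambda_2) + (1 + 2\lambda_2)^2 = 4\lambda_2^2 - 2\lambda_2 - 1 $$

with respect to $\lambda_2$. As $df/d\lambda_2 = 8\lambda_2 - 2 = 0$ at $\lambda_2^* = \tfrac{1}{4}$, we obtain 

$$ \mathbf{X}_3 = \mathbf{X}_2 + \lambda_2^* \mathbf{S}_2 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} + \frac{1}{4} \begin{Bmatrix} 0 \\ 2 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1.5 \end{Bmatrix} $$

Thus the optimum point is reached in two iterations. Even if we do not know this point 
to be optimum, we will not be able to move from this point in the next iteration. This 
can be verified as follows. 

Iteration 3 

Now 

$$ \nabla f_3 = \nabla f(\mathbf{X}_3) = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}, \quad |\nabla f_2|^2 = 2, \quad \text{and} \quad |\nabla f_3|^2 = 0 $$

Thus 

$$ \mathbf{S}_3 = -\nabla f_3 + \frac{|\nabla f_3|^2}{|\nabla f_2|^2}\, \mathbf{S}_2 = -\begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + \frac{0}{2} \begin{Bmatrix} 0 \\ 2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} $$

This shows that there is no search direction to reduce $f$ further, and hence $\mathbf{X}_3$ is 
optimum. 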
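The two iterations of Example 6.9 can be reproduced with a compact sketch of the Fletcher-Reeves steps. As an assumption specific to this quadratic, the optimal step length is computed in closed form from $[A] = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}$ rather than by a one-dimensional search; the function names and the stopping tolerance are illustrative.

```python
A = [[4.0, 2.0], [2.0, 2.0]]   # Hessian of the quadratic in Example 6.9
b = [1.0, -1.0]

def grad(x):
    return [A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1]]

def exact_step(g, s):
    # Minimizing step along s for a quadratic objective (assumption of this sketch)
    As = [A[0][0] * s[0] + A[0][1] * s[1], A[1][0] * s[0] + A[1][1] * s[1]]
    return -(s[0] * g[0] + s[1] * g[1]) / (s[0] * As[0] + s[1] * As[1])

def fletcher_reeves(x, eps=1e-10, max_iter=10):
    g = grad(x)
    s = [-g[0], -g[1]]                       # step 2: first direction is steepest descent
    points, directions = [list(x)], []
    for _ in range(max_iter):
        if g[0] ** 2 + g[1] ** 2 <= eps:     # optimality test
            break
        lam = exact_step(g, s)
        directions.append(list(s))
        x = [x[0] + lam * s[0], x[1] + lam * s[1]]        # Eq. (6.82)
        g_new = grad(x)
        beta = (g_new[0] ** 2 + g_new[1] ** 2) / (g[0] ** 2 + g[1] ** 2)
        s = [-g_new[0] + beta * s[0], -g_new[1] + beta * s[1]]   # Eq. (6.81)
        g = g_new
        points.append(list(x))
    return points, directions

points, directions = fletcher_reeves([0.0, 0.0])
print(points)       # X1, X2, X3
print(directions)   # S1, S2
```

The run visits (-1, 1) and then (-1, 1.5) with directions (-1, 1) and (0, 2), matching the hand calculation, and stops because the gradient vanishes.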


6.11 NEWTON'S METHOD 

Newton's method presented in Section 5.12.1 can be extended for the minimization of 
multivariable functions. For this, consider the quadratic approximation of the function 
$f(\mathbf{X})$ at $\mathbf{X} = \mathbf{X}_i$ using the Taylor's series expansion 

$$ f(\mathbf{X}) = f(\mathbf{X}_i) + \nabla f_i^T (\mathbf{X} - \mathbf{X}_i) + \tfrac{1}{2} (\mathbf{X} - \mathbf{X}_i)^T [J_i] (\mathbf{X} - \mathbf{X}_i) \tag{6.83} $$

where $[J_i] = [J]|_{\mathbf{X}_i}$ is the matrix of second partial derivatives (Hessian matrix) of $f$ 
evaluated at the point $\mathbf{X}_i$. By setting the partial derivatives of Eq. (6.83) equal to zero 
for the minimum of $f(\mathbf{X})$, we obtain 

$$ \frac{\partial f(\mathbf{X})}{\partial x_j} = 0, \qquad j = 1, 2, \ldots, n \tag{6.84} $$

Equations (6.84) and (6.83) give 

$$ \nabla f = \nabla f_i + [J_i](\mathbf{X} - \mathbf{X}_i) = \mathbf{0} \tag{6.85} $$

If $[J_i]$ is nonsingular, Eqs. (6.85) can be solved to obtain an improved approximation 
($\mathbf{X} = \mathbf{X}_{i+1}$) as 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i - [J_i]^{-1} \nabla f_i \tag{6.86} $$

Since higher-order terms have been neglected in Eq. (6.83), Eq. (6.86) is to be used 
iteratively to find the optimum solution $\mathbf{X}^*$. 

The sequence of points $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_{i+1}$ can be shown to converge to the actual 
solution $\mathbf{X}^*$ from any initial point $\mathbf{X}_1$ sufficiently close to the solution $\mathbf{X}^*$, provided 
that $[J_1]$ is nonsingular. It can be seen that Newton's method uses the second partial 
derivatives of the objective function (in the form of the matrix $[J_i]$) and hence is a 
second-order method. 

Example 6.10 Show that Newton's method finds the minimum of a quadratic 
function in one iteration. 

SOLUTION Let the quadratic function be given by 

$$ f(\mathbf{X}) = \tfrac{1}{2} \mathbf{X}^T [A] \mathbf{X} + \mathbf{B}^T \mathbf{X} + C $$

The minimum of $f(\mathbf{X})$ is given by 

$$ \nabla f = [A]\mathbf{X} + \mathbf{B} = \mathbf{0} $$

or 

$$ \mathbf{X}^* = -[A]^{-1} \mathbf{B} $$

The iterative step of Eq. (6.86) gives 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i - [A]^{-1} ([A]\mathbf{X}_i + \mathbf{B}) \tag{E$_1$} $$

where $\mathbf{X}_i$ is the starting point for the $i$th iteration. Thus Eq. (E$_1$) gives the exact solution 

$$ \mathbf{X}_{i+1} = \mathbf{X}^* = -[A]^{-1} \mathbf{B} $$

Figure 6.16 illustrates this process. 


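Figure 6.16 Minimization of a quadratic function in one step. 


Example 6.11 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$ by taking the 
starting point as $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$. 

SOLUTION 

To find $\mathbf{X}_2$ according to Eq. (6.86), we require $[J_1]^{-1}$, where 

$$ [J_1] = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} \end{bmatrix}_{\mathbf{X}_1} = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} $$

Therefore, 

$$ [J_1]^{-1} = \frac{1}{4} \begin{bmatrix} +2 & -2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & 1 \end{bmatrix} $$

As 

$$ \mathbf{g}_1 = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} $$

Equation (6.86) gives 

$$ \mathbf{X}_2 = \mathbf{X}_1 - [J_1]^{-1} \mathbf{g}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} - \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & 1 \end{bmatrix} \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ \tfrac{3}{2} \end{Bmatrix} $$

To see whether or not $\mathbf{X}_2$ is the optimum point, we evaluate 

$$ \mathbf{g}_2 = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix}_{\mathbf{X}_2} = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix}_{(-1,\,3/2)} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} $$

As $\mathbf{g}_2 = \mathbf{0}$, $\mathbf{X}_2$ is the optimum point. Thus the method has converged in one iteration 
for this quadratic function. 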
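The Newton step of Example 6.11 is sketched below for the two-variable case. Rather than forming $[J_1]^{-1}$ explicitly, the sketch solves $[J_i]\,\Delta\mathbf{X} = \nabla f_i$ by Cramer's rule, which is an implementation choice made here and not part of the text; the helper names are likewise illustrative.

```python
def grad(x):
    # Gradient of f = x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2
    return [1 + 4 * x[0] + 2 * x[1], -1 + 2 * x[0] + 2 * x[1]]

def hessian(x):
    # Constant Hessian [J] of the quadratic in Example 6.11
    return [[4.0, 2.0], [2.0, 2.0]]

def solve_2x2(M, rhs):
    # Cramer's rule for a 2 x 2 system M * v = rhs (M must be nonsingular)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("singular matrix")
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]

def newton_step(x):
    # Eq. (6.86): X_{i+1} = X_i - [J_i]^{-1} grad f_i
    dx = solve_2x2(hessian(x), grad(x))
    return [x[0] - dx[0], x[1] - dx[1]]

x2 = newton_step([0.0, 0.0])
print(x2, grad(x2))
```

A single step lands on (-1, 1.5) with a zero gradient, as in the worked example; for a nonquadratic function the step would have to be repeated.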

If $f(\mathbf{X})$ is a nonquadratic function, Newton's method may sometimes diverge, and 
it may converge to saddle points and relative maxima. This problem can be avoided 
by modifying Eq. (6.86) as 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i = \mathbf{X}_i - \lambda_i^* [J_i]^{-1} \nabla f_i \tag{6.87} $$

where $\lambda_i^*$ is the minimizing step length in the direction $\mathbf{S}_i = -[J_i]^{-1} \nabla f_i$. The 
modification indicated by Eq. (6.87) has a number of advantages. First, it will find the 
minimum in fewer steps compared to the original method. Second, it finds 
the minimum point in all cases, whereas the original method may not converge in some 
cases. Third, it usually avoids convergence to a saddle point or a maximum. With all 
these advantages, this method appears to be the most powerful minimization method. 
Despite these advantages, the method is not very useful in practice, due to the following 
features of the method: 

1. It requires the storing of the $n \times n$ matrix $[J_i]$. 

2. It becomes very difficult and sometimes impossible to compute the elements of 
the matrix $[J_i]$. 

3. It requires the inversion of the matrix $[J_i]$ at each step. 

4. It requires the evaluation of the quantity $[J_i]^{-1} \nabla f_i$ at each step. 

These features make the method impractical for problems involving a complicated 
objective function with a large number of variables. 

6.12 MARQUARDT METHOD 

The steepest descent method reduces the function value when the design vector $\mathbf{X}_i$ is 
away from the optimum point $\mathbf{X}^*$. The Newton method, on the other hand, converges 
fast when the design vector $\mathbf{X}_i$ is close to the optimum point $\mathbf{X}^*$. The Marquardt method 
[6.15] attempts to take advantage of both the steepest descent and Newton methods. 
This method modifies the diagonal elements of the Hessian matrix, $[J_i]$, as 

$$ [\tilde{J}_i] = [J_i] + \alpha_i [I] \tag{6.88} $$

where $[I]$ is an identity matrix and $\alpha_i$ is a positive constant that ensures the positive 
definiteness of $[\tilde{J}_i]$ when $[J_i]$ is not positive definite. It can be noted that when $\alpha_i$ is 
sufficiently large (on the order of $10^4$), the term $\alpha_i [I]$ dominates $[J_i]$ and the inverse 
of the matrix $[\tilde{J}_i]$ becomes 

$$ [\tilde{J}_i]^{-1} = \big[ [J_i] + \alpha_i [I] \big]^{-1} \approx \big[ \alpha_i [I] \big]^{-1} = \frac{1}{\alpha_i} [I] \tag{6.89} $$

Thus if the search direction $\mathbf{S}_i$ is computed as 

$$ \mathbf{S}_i = -[\tilde{J}_i]^{-1} \nabla f_i \tag{6.90} $$

$\mathbf{S}_i$ becomes a steepest descent direction for large values of $\alpha_i$. In the Marquardt method, 
the value of $\alpha_i$ is taken to be large at the beginning and then reduced to zero gradually 
as the iterative process progresses. Thus as the value of $\alpha_i$ decreases from a large value 
to zero, the characteristics of the search method change from those of a steepest descent 
method to those of the Newton method. The iterative process of a modified version of 
the Marquardt method can be described as follows. 

1. Start with an arbitrary initial point $\mathbf{X}_1$ and constants $\alpha_1$ (on the order of 
$10^4$), $c_1$ ($0 < c_1 < 1$), $c_2$ ($c_2 > 1$), and $\varepsilon$ (on the order of $10^{-2}$). Set the iteration 
number as $i = 1$. 

2. Compute the gradient of the function, $\nabla f_i = \nabla f(\mathbf{X}_i)$. 

3. Test for optimality of the point $\mathbf{X}_i$. If $\|\nabla f_i\| = \|\nabla f(\mathbf{X}_i)\| \le \varepsilon$, $\mathbf{X}_i$ is optimum 
and hence stop the process. Otherwise, go to step 4. 

4. Find the new vector $\mathbf{X}_{i+1}$ as 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \mathbf{S}_i = \mathbf{X}_i - \big[ [J_i] + \alpha_i [I] \big]^{-1} \nabla f_i \tag{6.91} $$

5. Compare the values of $f_{i+1}$ and $f_i$. If $f_{i+1} < f_i$, go to step 6. If $f_{i+1} \ge f_i$, go 
to step 7. 

6. Set $\alpha_{i+1} = c_1 \alpha_i$, $i = i + 1$, and go to step 2. 

7. Set $\alpha_i = c_2 \alpha_i$ and go to step 4. 

An advantage of this method is the absence of the step size $\lambda_i$ along the search 
direction $\mathbf{S}_i$. In fact, the algorithm above can be modified by introducing an optimal 
step length in Eq. (6.91) as 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i = \mathbf{X}_i - \lambda_i^* \big[ [J_i] + \alpha_i [I] \big]^{-1} \nabla f_i \tag{6.92} $$

where $\lambda_i^*$ is found using any of the one-dimensional search methods described in 
Chapter 5. 
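The seven steps above (without the optional step length of Eq. (6.92)) can be sketched for a two-variable problem as follows. The test function is the quadratic used in the earlier examples of this chapter, the parameter values $\alpha_1 = 10^4$, $c_1 = 1/4$, $c_2 = 2$, $\varepsilon = 10^{-2}$ are illustrative choices, and the helper names and the caps on the loop counts are assumptions of this sketch.

```python
import math

def f(x):
    return x[0] - x[1] + 2 * x[0] ** 2 + 2 * x[0] * x[1] + x[1] ** 2

def grad(x):
    return [1 + 4 * x[0] + 2 * x[1], -1 + 2 * x[0] + 2 * x[1]]

def hessian(x):
    return [[4.0, 2.0], [2.0, 2.0]]       # constant for this quadratic

def solve_2x2(M, rhs):
    # Cramer's rule for M * v = rhs
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]

def marquardt(x, alpha=1e4, c1=0.25, c2=2.0, eps=1e-2, max_iter=100):
    history = [list(x)]
    for _ in range(max_iter):
        g = grad(x)                                       # step 2
        if math.hypot(g[0], g[1]) <= eps:                 # step 3
            break
        for _ in range(60):                               # steps 4, 5 and 7
            J = hessian(x)
            M = [[J[0][0] + alpha, J[0][1]], [J[1][0], J[1][1] + alpha]]
            s = solve_2x2(M, g)
            x_new = [x[0] - s[0], x[1] - s[1]]            # Eq. (6.91)
            if f(x_new) < f(x):
                break
            alpha *= c2                                   # step 7: increase alpha, retry
        x, alpha = x_new, c1 * alpha                      # step 6
        history.append(list(x))
    return x, history

x_final, history = marquardt([0.0, 0.0])
print(history[1])                     # first iterate, a very short steepest-descent-like step
print(x_final, len(history) - 1)      # point where ||grad f|| <= eps, and iteration count
```

The first steps are tiny because $\alpha_i$ is large; as $\alpha_i$ shrinks, the steps approach Newton steps and the iterates move quickly toward (-1, 1.5).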

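Example 6.12 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$ from the starting 
point $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$ using the Marquardt method with $\alpha_1 = 10^4$, $c_1 = \tfrac{1}{4}$, $c_2 = 2$, and 
$\varepsilon = 10^{-2}$. 

SOLUTION 

Iteration 1 ($i = 1$) 

Here $f_1 = f(\mathbf{X}_1) = 0.0$ and 

$$ \nabla f_1 = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} $$

Since $\|\nabla f_1\| = 1.4142 > \varepsilon$, we compute 

$$ [J_1] = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\ \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dfrac{\partial^2 f}{\partial x_2^2} \end{bmatrix}_{(0,0)} = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} $$

$$ \mathbf{X}_2 = \mathbf{X}_1 - \big[ [J_1] + \alpha_1 [I] \big]^{-1} \nabla f_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} - \begin{bmatrix} 4 + 10^4 & 2 \\ 2 & 2 + 10^4 \end{bmatrix}^{-1} \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} -0.9998 \\ 1.0000 \end{Bmatrix} 10^{-4} $$

As $f_2 = f(\mathbf{X}_2) = -1.9997 \times 10^{-4} < f_1$, we set $\alpha_2 = c_1 \alpha_1 = 2500$, $i = 2$, and proceed 
to the next iteration. 

Iteration 2 ($i = 2$) 

The gradient vector corresponding to $\mathbf{X}_2$ is given by $\nabla f_2 = \begin{Bmatrix} 0.9998 \\ -1.0000 \end{Bmatrix}$, $\|\nabla f_2\| = 
1.4141 > \varepsilon$, and hence we compute 

$$ \mathbf{X}_3 = \mathbf{X}_2 - \big[ [J_2] + \alpha_2 [I] \big]^{-1} \nabla f_2 = \begin{Bmatrix} -0.9998 \times 10^{-4} \\ 1.0000 \times 10^{-4} \end{Bmatrix} - \begin{bmatrix} 2504 & 2 \\ 2 & 2502 \end{bmatrix}^{-1} \begin{Bmatrix} 0.9998 \\ -1.0000 \end{Bmatrix} = \begin{Bmatrix} -4.9958 \times 10^{-4} \\ 5.0000 \times 10^{-4} \end{Bmatrix} $$

Since $f_3 = f(\mathbf{X}_3) = -0.9993 \times 10^{-3} < f_2$, we set $\alpha_3 = c_1 \alpha_2 = 625$, $i = 3$, and 
proceed to the next iteration. The iterative process is to be continued until the convergence 
criterion, $\|\nabla f_i\| < \varepsilon$, is satisfied. 


6.13 QUASI-NEWTON METHODS 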

The basic iterative process used in Newton's method is given by Eq. (6.86): 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i - [J_i]^{-1} \nabla f(\mathbf{X}_i) \tag{6.93} $$

where the Hessian matrix $[J_i]$ is composed of the second partial derivatives of $f$ 
and varies with the design vector $\mathbf{X}_i$ for a nonquadratic (general nonlinear) objective 
function $f$. The basic idea behind the quasi-Newton or variable metric methods is to 
approximate either $[J_i]$ by another matrix $[A_i]$ or $[J_i]^{-1}$ by another matrix $[B_i]$, using 
only the first partial derivatives of $f$. If $[J_i]^{-1}$ is approximated by $[B_i]$, Eq. (6.93) can 
be expressed as 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i - \lambda_i^* [B_i] \nabla f(\mathbf{X}_i) \tag{6.94} $$

where $\lambda_i^*$ can be considered as the optimal step length along the direction 

$$ \mathbf{S}_i = -[B_i] \nabla f(\mathbf{X}_i) \tag{6.95} $$

It can be seen that the steepest descent direction method can be obtained as a special 
case of Eq. (6.95) by setting $[B_i] = [I]$. 

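Computation of $[B_i]$. To implement Eq. (6.94), an approximate inverse of the 
Hessian matrix, $[B_i] = [A_i]^{-1}$, is to be computed. For this, we first expand the gradient of 
$f$ about an arbitrary reference point, $\mathbf{X}_0$, using Taylor's series as 

$$ \nabla f(\mathbf{X}) \approx \nabla f(\mathbf{X}_0) + [J_0](\mathbf{X} - \mathbf{X}_0) \tag{6.96} $$

If we pick two points $\mathbf{X}_i$ and $\mathbf{X}_{i+1}$ and use $[A_i]$ to approximate $[J_0]$, Eq. (6.96) can 
be rewritten as 

$$ \nabla f_{i+1} = \nabla f(\mathbf{X}_0) + [A_i](\mathbf{X}_{i+1} - \mathbf{X}_0) \tag{6.97} $$

$$ \nabla f_i = \nabla f(\mathbf{X}_0) + [A_i](\mathbf{X}_i - \mathbf{X}_0) \tag{6.98} $$

Subtracting Eq. (6.98) from (6.97) yields 

$$ [A_i]\, \mathbf{d}_i = \mathbf{g}_i \tag{6.99} $$

where 

$$ \mathbf{d}_i = \mathbf{X}_{i+1} - \mathbf{X}_i \tag{6.100} $$

$$ \mathbf{g}_i = \nabla f_{i+1} - \nabla f_i \tag{6.101} $$

The solution of Eq. (6.99) for $\mathbf{d}_i$ can be written as 

$$ \mathbf{d}_i = [B_i]\, \mathbf{g}_i \tag{6.102} $$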

where $[B_i] = [A_i]^{-1}$ denotes an approximation to the inverse of the Hessian matrix, 
$[J_0]^{-1}$. It can be seen that Eq. (6.102) represents a system of $n$ equations in $n^2$ unknown 
elements of the matrix $[B_i]$. Thus for $n > 1$, the choice of $[B_i]$ is not unique and one 
would like to choose $[B_i]$ that is closest to $[J_0]^{-1}$, in some sense. Numerous techniques 
have been suggested in the literature for the computation of $[B_i]$ as the iterative process 
progresses (i.e., for the computation of $[B_{i+1}]$ once $[B_i]$ is known). A major concern 
is that in addition to satisfying Eq. (6.102), the symmetry and positive definiteness of 
the matrix $[B_i]$ is to be maintained; that is, if $[B_i]$ is symmetric and positive definite, 
$[B_{i+1}]$ must remain symmetric and positive definite. 


6.13.1 Rank 1 Updates 

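The general formula for updating the matrix $[B_i]$ can be written as 

$$ [B_{i+1}] = [B_i] + [\Delta B_i] \tag{6.103} $$

where $[\Delta B_i]$ can be considered to be the update (or correction) matrix added to $[B_i]$. 
Theoretically, the matrix $[\Delta B_i]$ can have its rank as high as $n$. However, in practice, 
most updates, $[\Delta B_i]$, are only of rank 1 or 2. To derive a rank 1 update, we simply 
choose a scaled outer product of a vector $\mathbf{z}$ for $[\Delta B_i]$ as 

$$ [\Delta B_i] = c\, \mathbf{z} \mathbf{z}^T \tag{6.104} $$

where the constant $c$ and the $n$-component vector $\mathbf{z}$ are to be determined. Equations 
(6.103) and (6.104) lead to 

$$ [B_{i+1}] = [B_i] + c\, \mathbf{z} \mathbf{z}^T \tag{6.105} $$

By forcing Eq. (6.105) to satisfy the quasi-Newton condition, Eq. (6.102), 

$$ \mathbf{d}_i = [B_{i+1}]\, \mathbf{g}_i \tag{6.106} $$

we obtain 

$$ \mathbf{d}_i = ([B_i] + c\, \mathbf{z} \mathbf{z}^T)\, \mathbf{g}_i = [B_i]\, \mathbf{g}_i + c\, \mathbf{z} (\mathbf{z}^T \mathbf{g}_i) \tag{6.107} $$

Since $(\mathbf{z}^T \mathbf{g}_i)$ in Eq. (6.107) is a scalar, we can rewrite Eq. (6.107) as 

$$ c\, \mathbf{z} = \frac{\mathbf{d}_i - [B_i]\, \mathbf{g}_i}{\mathbf{z}^T \mathbf{g}_i} \tag{6.108} $$

Thus a simple choice for $\mathbf{z}$ and $c$ would be 

$$ \mathbf{z} = \mathbf{d}_i - [B_i]\, \mathbf{g}_i \tag{6.109} $$

$$ c = \frac{1}{\mathbf{z}^T \mathbf{g}_i} \tag{6.110} $$

This leads to the unique rank 1 update formula for $[B_{i+1}]$: 

$$ [B_{i+1}] = [B_i] + [\Delta B_i] \equiv [B_i] + \frac{(\mathbf{d}_i - [B_i]\mathbf{g}_i)(\mathbf{d}_i - [B_i]\mathbf{g}_i)^T}{(\mathbf{d}_i - [B_i]\mathbf{g}_i)^T \mathbf{g}_i} \tag{6.111} $$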


This formula has been attributed to Broyden [6.16]. To implement Eq. (6.111), an initial 
symmetric positive definite matrix is selected for $[B_1]$ at the start of the algorithm, and 
the next point $\mathbf{X}_2$ is computed using Eq. (6.94). Then the new matrix $[B_2]$ is computed 
using Eq. (6.111) and the new point $\mathbf{X}_3$ is determined from Eq. (6.94). This iterative 
process is continued until convergence is achieved. If $[B_i]$ is symmetric, Eq. (6.111) 
ensures that $[B_{i+1}]$ is also symmetric. However, there is no guarantee that $[B_{i+1}]$ 
remains positive definite even if $[B_i]$ is positive definite. This might lead to a breakdown 
of the procedure, especially when used for the optimization of nonquadratic functions. 
It can be verified easily that the columns of the matrix $[\Delta B_i]$ given by Eq. (6.111) are 
multiples of each other. Thus the updating matrix has only one independent column and 
hence the rank of the matrix will be 1. This is the reason why Eq. (6.111) is considered 
to be a rank 1 updating formula. Although the Broyden formula, Eq. (6.111), is not 
robust, it has the property of quadratic convergence [6.17]. The rank 2 update formulas 
given next guarantee both symmetry and positive definiteness of the matrix $[B_{i+1}]$ 
and are more robust in minimizing general nonlinear functions, hence are preferred in 
practical applications. 
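A small sketch of the rank 1 update, Eq. (6.111), is given below. The data are an illustrative assumption: $[B_1] = [I]$, with $\mathbf{d}_1 = \{-1, 1\}^T$ and $\mathbf{g}_1 = \{-2, 0\}^T$ taken from the first steepest descent step on the quadratic $f = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$ starting at the origin. The helper names and the guard on a vanishing denominator are choices made here.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank1_update(B, d, g):
    # Eq. (6.111): B_new = B + z z^T / (z^T g), with z = d - B g
    Bg = mat_vec(B, g)
    z = [di - bgi for di, bgi in zip(d, Bg)]
    denom = dot(z, g)
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("z^T g is (nearly) zero; the update is undefined")
    return [[B[r][c] + z[r] * z[c] / denom for c in range(len(z))]
            for r in range(len(z))]

B1 = [[1.0, 0.0], [0.0, 1.0]]
d1 = [-1.0, 1.0]          # X2 - X1
g1 = [-2.0, 0.0]          # grad f(X2) - grad f(X1)
B2 = rank1_update(B1, d1, g1)

print(B2)                 # updated matrix
print(mat_vec(B2, g1))    # should reproduce d1 (quasi-Newton condition)
print(B2[0][0] * B2[1][1] - B2[0][1] * B2[1][0])   # determinant of B2
```

For these data the updated matrix satisfies Eq. (6.106) and stays symmetric, but its determinant is zero, so it is no longer positive definite. This is a concrete instance of the lack of robustness noted above.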


6.13.2 Rank 2 Updates 

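In rank 2 updates we choose the update matrix $[\Delta B_i]$ as the sum of two rank 1 
updates as 

$$ [\Delta B_i] = c_1 \mathbf{z}_1 \mathbf{z}_1^T + c_2 \mathbf{z}_2 \mathbf{z}_2^T \tag{6.112} $$

where the constants $c_1$ and $c_2$ and the $n$-component vectors $\mathbf{z}_1$ and $\mathbf{z}_2$ are to be 
determined. Equations (6.103) and (6.112) lead to 

$$ [B_{i+1}] = [B_i] + c_1 \mathbf{z}_1 \mathbf{z}_1^T + c_2 \mathbf{z}_2 \mathbf{z}_2^T \tag{6.113} $$

By forcing Eq. (6.113) to satisfy the quasi-Newton condition, Eq. (6.106), we obtain 

$$ \mathbf{d}_i = [B_i]\, \mathbf{g}_i + c_1 \mathbf{z}_1 (\mathbf{z}_1^T \mathbf{g}_i) + c_2 \mathbf{z}_2 (\mathbf{z}_2^T \mathbf{g}_i) \tag{6.114} $$

where $(\mathbf{z}_1^T \mathbf{g}_i)$ and $(\mathbf{z}_2^T \mathbf{g}_i)$ can be identified as scalars. Although the vectors $\mathbf{z}_1$ and $\mathbf{z}_2$ in 
Eq. (6.114) are not unique, the following choices can be made to satisfy Eq. (6.114): 

$$ \mathbf{z}_1 = \mathbf{d}_i \tag{6.115} $$

$$ \mathbf{z}_2 = [B_i]\, \mathbf{g}_i \tag{6.116} $$

$$ c_1 = \frac{1}{\mathbf{z}_1^T \mathbf{g}_i} \tag{6.117} $$

$$ c_2 = -\frac{1}{\mathbf{z}_2^T \mathbf{g}_i} \tag{6.118} $$

Thus the rank 2 update formula can be expressed as 

$$ [B_{i+1}] = [B_i] + [\Delta B_i] \equiv [B_i] + \frac{\mathbf{d}_i \mathbf{d}_i^T}{\mathbf{d}_i^T \mathbf{g}_i} - \frac{([B_i]\mathbf{g}_i)([B_i]\mathbf{g}_i)^T}{([B_i]\mathbf{g}_i)^T \mathbf{g}_i} \tag{6.119} $$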


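This equation is known as the Davidon-Fletcher-Powell (DFP) formula [6.20, 6.21]. 
Since 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i \tag{6.120} $$

where $\mathbf{S}_i$ is the search direction, $\mathbf{d}_i = \mathbf{X}_{i+1} - \mathbf{X}_i$ can be rewritten as 

$$ \mathbf{d}_i = \lambda_i^* \mathbf{S}_i \tag{6.121} $$

Thus Eq. (6.119) can be expressed as 

$$ [B_{i+1}] = [B_i] + \frac{\lambda_i^* \mathbf{S}_i \mathbf{S}_i^T}{\mathbf{S}_i^T \mathbf{g}_i} - \frac{[B_i]\, \mathbf{g}_i \mathbf{g}_i^T [B_i]}{\mathbf{g}_i^T [B_i]\, \mathbf{g}_i} \tag{6.122} $$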


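Remarks: 

1. Equations (6.111) and (6.119) are known as inverse update formulas since these 
equations approximate the inverse of the Hessian matrix of $f$. 

2. It is possible to derive a family of direct update formulas in which 
approximations to the Hessian matrix itself are considered. For this we express the 
quasi-Newton condition as [see Eq. (6.99)] 

$$ \mathbf{g}_i = [A_i]\, \mathbf{d}_i \tag{6.123} $$

The procedure used in deriving Eqs. (6.111) and (6.119) can be followed 
by using $[A_i]$, $\mathbf{d}_i$, and $\mathbf{g}_i$ in place of $[B_i]$, $\mathbf{g}_i$, and $\mathbf{d}_i$, respectively. This 
leads to the rank 2 update formula (similar to Eq. (6.119)), known as the 
Broyden-Fletcher-Goldfarb-Shanno (BFGS) formula [6.22-6.25]: 

$$ [A_{i+1}] = [A_i] + \frac{\mathbf{g}_i \mathbf{g}_i^T}{\mathbf{g}_i^T \mathbf{d}_i} - \frac{([A_i]\mathbf{d}_i)([A_i]\mathbf{d}_i)^T}{([A_i]\mathbf{d}_i)^T \mathbf{d}_i} \tag{6.124} $$

In practical computations, Eq. (6.124) is rewritten more conveniently in terms 
of $[B_i]$, as 

$$ [B_{i+1}] = [B_i] + \frac{\mathbf{d}_i \mathbf{d}_i^T}{\mathbf{d}_i^T \mathbf{g}_i} \left( 1 + \frac{\mathbf{g}_i^T [B_i]\, \mathbf{g}_i}{\mathbf{d}_i^T \mathbf{g}_i} \right) - \frac{[B_i]\, \mathbf{g}_i \mathbf{d}_i^T}{\mathbf{d}_i^T \mathbf{g}_i} - \frac{\mathbf{d}_i \mathbf{g}_i^T [B_i]}{\mathbf{d}_i^T \mathbf{g}_i} \tag{6.125} $$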


3. The DFP and the BFGS formulas belong to a family of rank 2 updates known 
as Huang's family of updates [6.18], which can be expressed for updating the 
inverse of the Hessian matrix as 

$$ [B_{i+1}] = \rho_i \left( [B_i] - \frac{[B_i]\, \mathbf{g}_i \mathbf{g}_i^T [B_i]}{\mathbf{g}_i^T [B_i]\, \mathbf{g}_i} + \theta_i\, \mathbf{y}_i \mathbf{y}_i^T \right) + \frac{\mathbf{d}_i \mathbf{d}_i^T}{\mathbf{d}_i^T \mathbf{g}_i} \tag{6.126} $$

where 

$$ \mathbf{y}_i = (\mathbf{g}_i^T [B_i]\, \mathbf{g}_i)^{1/2} \left( \frac{\mathbf{d}_i}{\mathbf{d}_i^T \mathbf{g}_i} - \frac{[B_i]\, \mathbf{g}_i}{\mathbf{g}_i^T [B_i]\, \mathbf{g}_i} \right) \tag{6.127} $$

and $\rho_i$ and $\theta_i$ are constant parameters. It has been shown [6.18] that Eq. (6.126) 
maintains the symmetry and positive definiteness of $[B_{i+1}]$ if $[B_i]$ is symmetric 
and positive definite. Different choices of $\rho_i$ and $\theta_i$ in Eq. (6.126) lead to 
different algorithms. For example, when $\rho_i = 1$ and $\theta_i = 0$, Eq. (6.126) gives 
the DFP formula, Eq. (6.119). When $\rho_i = 1$ and $\theta_i = 1$, Eq. (6.126) yields the 
BFGS formula, Eq. (6.125). 

4. It has been shown that the BFGS method exhibits superlinear convergence near 
$\mathbf{X}^*$ [6.17]. 

5. Numerical experience indicates that the BFGS method is the best unconstrained 
variable metric method and is less influenced by errors in finding $\lambda_i^*$ compared 
to the DFP method. 

6. The methods discussed in this section are also known as secant methods since 
Eqs. (6.99) and (6.102) can be considered as secant equations (see Section 5.12). 
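The two inverse update formulas, Eq. (6.119) for DFP and Eq. (6.125) for BFGS, can be compared on one set of data. The sketch below uses $[B_1] = [I]$, $\mathbf{d}_1 = \{-1, 1\}^T$, and $\mathbf{g}_1 = \{-2, 0\}^T$, which are illustrative values taken from a first steepest descent step on the quadratic of the earlier examples; the function names are assumptions, and $[B_i]$ is assumed symmetric.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dfp_update(B, d, g):
    # Eq. (6.119): B + d d^T / (d^T g) - (B g)(B g)^T / ((B g)^T g)
    Bg = mat_vec(B, g)
    dg, gBg = dot(d, g), dot(Bg, g)
    n = len(d)
    return [[B[r][c] + d[r] * d[c] / dg - Bg[r] * Bg[c] / gBg
             for c in range(n)] for r in range(n)]

def bfgs_update(B, d, g):
    # Eq. (6.125), inverse form; uses g^T B = (B g)^T because B is symmetric
    Bg = mat_vec(B, g)
    dg, gBg = dot(d, g), dot(Bg, g)
    n = len(d)
    return [[B[r][c] + (d[r] * d[c] / dg) * (1 + gBg / dg)
             - Bg[r] * d[c] / dg - d[r] * Bg[c] / dg
             for c in range(n)] for r in range(n)]

B1 = [[1.0, 0.0], [0.0, 1.0]]
d1 = [-1.0, 1.0]
g1 = [-2.0, 0.0]
B_dfp = dfp_update(B1, d1, g1)
B_bfgs = bfgs_update(B1, d1, g1)

print(B_dfp, mat_vec(B_dfp, g1))     # both products should reproduce d1
print(B_bfgs, mat_vec(B_bfgs, g1))
```

Both updated matrices satisfy the quasi-Newton condition $\mathbf{d}_i = [B_{i+1}]\mathbf{g}_i$ and remain symmetric and positive definite for these data, while differing from each other in the entry that the condition leaves undetermined.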

The DFP and BFGS iterative methods are described in detail in the following sections. 


6.14 DAVIDON-FLETCHER-POWELL METHOD 


The iterative procedure of the Davidon-Fletcher-Powell (DFP) method can be 
described as follows: 

1. Start with an initial point $\mathbf{X}_1$ and an $n \times n$ positive definite symmetric matrix 
$[B_1]$ to approximate the inverse of the Hessian matrix of $f$. Usually, $[B_1]$ is 
taken as the identity matrix $[I]$. Set the iteration number as $i = 1$. 

2. Compute the gradient of the function, $\nabla f_i$, at point $\mathbf{X}_i$, and set 

$$ \mathbf{S}_i = -[B_i] \nabla f_i \tag{6.128} $$

3. Find the optimal step length $\lambda_i^*$ in the direction $\mathbf{S}_i$ and set 

$$ \mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i \tag{6.129} $$

4. Test the new point $\mathbf{X}_{i+1}$ for optimality. If $\mathbf{X}_{i+1}$ is optimal, terminate the iterative 
process. Otherwise, go to step 5. 

5. Update the matrix $[B_i]$ using Eq. (6.119) as 

$$ [B_{i+1}] = [B_i] + [M_i] + [N_i] \tag{6.130} $$

where 

$$ [M_i] = \lambda_i^* \frac{\mathbf{S}_i \mathbf{S}_i^T}{\mathbf{S}_i^T \mathbf{g}_i} \tag{6.131} $$

$$ [N_i] = -\frac{([B_i]\mathbf{g}_i)([B_i]\mathbf{g}_i)^T}{\mathbf{g}_i^T [B_i]\, \mathbf{g}_i} \tag{6.132} $$

$$ \mathbf{g}_i = \nabla f(\mathbf{X}_{i+1}) - \nabla f(\mathbf{X}_i) = \nabla f_{i+1} - \nabla f_i \tag{6.133} $$

6. Set the new iteration number as $i = i + 1$, and go to step 2. 
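The six steps above can be sketched as follows for a quadratic test problem. As assumptions of this illustration, the objective is $f = x_1 - x_2 + 2x_1^2 + 2x_1x_2 + x_2^2$ (so the gradient is $[A]\mathbf{X} + \mathbf{B}$ with a known $[A]$), the optimal step length is computed in closed form instead of by a one-dimensional search, and the function names and tolerance are arbitrary.

```python
A = [[4.0, 2.0], [2.0, 2.0]]
b = [1.0, -1.0]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def grad(x):
    return [ax + bi for ax, bi in zip(mat_vec(A, x), b)]

def dfp_minimize(x, eps=1e-10, max_iter=20):
    n = len(x)
    B = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]   # step 1
    g = grad(x)
    directions = []
    for _ in range(max_iter):
        if max(abs(v) for v in g) <= eps:
            break
        s = [-v for v in mat_vec(B, g)]                   # Eq. (6.128)
        lam = -dot(s, g) / dot(s, mat_vec(A, s))          # exact step, quadratic only
        x_new = [xi + lam * si for xi, si in zip(x, s)]   # Eq. (6.129)
        g_new = grad(x_new)
        directions.append(s)
        y = [gn - go for gn, go in zip(g_new, g)]         # g_i of Eq. (6.133)
        x, g_old, g = x_new, g, g_new
        if max(abs(v) for v in g) <= eps:                 # step 4
            break
        By = mat_vec(B, y)
        sy, yBy = dot(s, y), dot(By, y)
        # Eqs. (6.130)-(6.132): B + lam * s s^T / (s^T g_i) - (B g_i)(B g_i)^T / (g_i^T B g_i)
        B = [[B[r][c] + lam * s[r] * s[c] / sy - By[r] * By[c] / yBy
              for c in range(n)] for r in range(n)]
    return x, directions

x_min, directions = dfp_minimize([0.0, 0.0])
print(x_min)
print(directions)
```

For this two-variable quadratic the sketch reaches (-1, 1.5) after two one-dimensional minimizations, along the directions (-1, 1) and (0, 1).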

Note: The matrix $[B_{i+1}]$, given by Eq. (6.130), remains positive definite only if $\lambda_i^*$
is found accurately. Thus if $\lambda_i^*$ is not found accurately in any iteration, the matrix $[B_i]$
should not be updated. There are several alternatives in such a case. One possibility is to
compute a better value of $\lambda_i^*$ by using a larger number of refits in the one-dimensional
minimization procedure (until the product $S_i^T \nabla f_{i+1}$ becomes sufficiently small). However,
this involves more computational effort. Another possibility is to specify a maximum
number of refits in the one-dimensional minimization method and to skip the updating
of $[B_i]$ if $\lambda_i^*$ could not be found accurately in the specified number of refits. The last
possibility is to continue updating the matrix $[B_i]$ using the approximate values of $\lambda_i^*$
found, but restart the whole procedure after a certain number of iterations, that is, restart
with $i = 1$ in step 2 of the method.
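
The update of step 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the text: the helper names (`mat_vec`, `dot`, `dfp_update`) are hypothetical, matrices are stored as nested lists, and the sample data are the quantities $[B_1]$, $S_1$, $\lambda_1^*$, and $g_1$ that appear in Example 6.15.

```python
def mat_vec(M, v):
    """Product of a square matrix (nested lists) and a vector."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def dot(u, v):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def dfp_update(B, S, step, g):
    """Return [B_{i+1}] = [B_i] + [M_i] + [N_i], Eqs. (6.130)-(6.132).

    B    : current approximation to the inverse Hessian
    S    : search direction S_i
    step : step length lambda_i* found along S_i
    g    : gradient difference g_i of Eq. (6.133)
    """
    n = len(S)
    Bg = mat_vec(B, g)      # [B_i] g_i
    sg = dot(S, g)          # S_i^T g_i
    gBg = dot(g, Bg)        # g_i^T [B_i] g_i
    return [[B[r][c] + step * S[r] * S[c] / sg - Bg[r] * Bg[c] / gBg
             for c in range(n)] for r in range(n)]


# Data of Example 6.15: [B_1] = [I], S_1 = {-1, 1}, lambda_1* = 1, g_1 = {-2, 0}
B2 = dfp_update([[1.0, 0.0], [0.0, 1.0]], [-1.0, 1.0], 1.0, [-2.0, 0.0])
print(B2)
```

If the step length is inaccurate, the denominator $S_i^T g_i$ may become small or negative; a practical routine would test for this and skip or reset the update, as discussed in the note above.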


Example 6.13 Show that the DFP method is a conjugate gradient method.

SOLUTION Consider the quadratic function

$$f(X) = \tfrac{1}{2} X^T [A] X + B^T X + C \tag{E1}$$

for which the gradient is given by

$$\nabla f = [A] X + B \tag{E2}$$

Equations (6.133) and (E2) give

$$g_i = \nabla f_{i+1} - \nabla f_i = [A](X_{i+1} - X_i) \tag{E3}$$

Since

$$X_{i+1} = X_i + \lambda_i^* S_i \tag{E4}$$

Eq. (E3) becomes

$$g_i = \lambda_i^* [A] S_i \tag{E5}$$

or

$$[A] S_i = \frac{1}{\lambda_i^*}\, g_i \tag{E6}$$

Premultiplication of Eq. (E6) by $[B_{i+1}]$ leads to

$$[B_{i+1}][A] S_i = \frac{1}{\lambda_i^*}\,([B_i] + [M_i] + [N_i])\, g_i \tag{E7}$$

Equations (6.131) and (E5) yield

$$[M_i]\, g_i = \lambda_i^* \frac{S_i S_i^T g_i}{S_i^T g_i} = \lambda_i^* S_i \tag{E8}$$

Equation (6.132) can be used to obtain

$$[N_i]\, g_i = -\frac{([B_i] g_i)(g_i^T [B_i]^T g_i)}{g_i^T [B_i] g_i} = -[B_i]\, g_i \tag{E9}$$

since $[B_i]$ is symmetric. By substituting Eqs. (E8) and (E9) into Eq. (E7), we obtain

$$[B_{i+1}][A] S_i = \frac{1}{\lambda_i^*}\,([B_i] g_i + \lambda_i^* S_i - [B_i] g_i) = S_i \tag{E10}$$

The quantity $S_{i+1}^T [A] S_i$ can be written as

$$S_{i+1}^T [A] S_i = -([B_{i+1}] \nabla f_{i+1})^T [A] S_i = -\nabla f_{i+1}^T [B_{i+1}][A] S_i = -\nabla f_{i+1}^T S_i = 0 \tag{E11}$$

since $\lambda_i^*$ is the minimizing step in the direction $S_i$. Equation (E11) proves that the
successive directions generated in the DFP method are [A]-conjugate and hence the
method is a conjugate gradient method.
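
Equations (E10) and (E11) can be checked numerically on a small case. The sketch below is an illustration only: it assumes the quadratic function of Example 6.15, whose Hessian is $[A] = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}$, and uses the values of $[B_2]$, $S_1$, and $S_2$ obtained there. The helper names are hypothetical.

```python
def mat_vec(M, v):
    """Product of a square matrix (nested lists) and a vector."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def dot(u, v):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


A = [[4.0, 2.0], [2.0, 2.0]]       # Hessian of the quadratic in Example 6.15
B2 = [[0.5, -0.5], [-0.5, 1.5]]    # DFP matrix after the first update
S1 = [-1.0, 1.0]                   # first search direction
S2 = [0.0, 1.0]                    # second search direction

AS1 = mat_vec(A, S1)
lhs = mat_vec(B2, AS1)   # Eq. (E10): [B_2][A]S_1 should reproduce S_1
conj = dot(S2, AS1)      # Eq. (E11): S_2^T [A] S_1 should vanish

print(lhs, conj)
```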


Example 6.14 Minimize $f(x_1, x_2) = 100(x_1^2 - x_2)^2 + (1 - x_1)^2$ taking $X_1 = \{-2 \;\; -2\}^T$ as
the starting point. Use the cubic interpolation method for one-dimensional minimization.

SOLUTION Since this method requires the gradient of $f$, we find that

$$\nabla f = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix} = \begin{Bmatrix} 400 x_1 (x_1^2 - x_2) - 2(1 - x_1) \\ -200 (x_1^2 - x_2) \end{Bmatrix}$$

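Iteration 1

We take

$$[B_1] = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

At $X_1 = \{-2 \;\; -2\}^T$, $\nabla f_1 = \nabla f(X_1) = \{-4806 \;\; -1200\}^T$ and $f_1 = 3609$. Therefore,

$$S_1 = -[B_1] \nabla f_1 = \begin{Bmatrix} 4806 \\ 1200 \end{Bmatrix}$$

By normalizing, we obtain

$$S_1 = \frac{1}{[(4806)^2 + (1200)^2]^{1/2}} \begin{Bmatrix} 4806 \\ 1200 \end{Bmatrix} = \begin{Bmatrix} 0.970 \\ 0.244 \end{Bmatrix}$$

To find $\lambda_1^*$, we minimize

$$f(X_1 + \lambda_1 S_1) = f(-2 + 0.970\lambda_1,\; -2 + 0.244\lambda_1) = 100(6 - 4.124\lambda_1 + 0.938\lambda_1^2)^2 + (3 - 0.97\lambda_1)^2 \tag{E1}$$

with respect to $\lambda_1$. Equation (E1) gives

$$\frac{df}{d\lambda_1} = 200(6 - 4.124\lambda_1 + 0.938\lambda_1^2)(1.876\lambda_1 - 4.124) - 1.94(3 - 0.97\lambda_1)$$

Since the solution of the equation $df/d\lambda_1 = 0$ cannot be obtained in a simple manner,
we use the cubic interpolation method for finding $\lambda_1^*$.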
Cubic Interpolation Method (First Fitting)

Stage 1: As the search direction $S_1$ is normalized already, we go to stage 2.

Stage 2: To establish lower and upper bounds on the optimal step size $\lambda_1^*$, we have to
find two points A and B at which the slope $df/d\lambda_1$ has different signs. We
take A = 0 and choose an initial step size of $t_0 = 0.25$ to find B.

At $\lambda_1 = A = 0$:

$$f_A = f(\lambda_1 = A = 0) = 3609, \qquad f'_A = \left.\frac{df}{d\lambda_1}\right|_{\lambda_1 = A = 0} = -4956.64$$

At $\lambda_1 = t_0 = 0.25$:

$$f = 2535.62, \qquad \frac{df}{d\lambda_1} = -3680.82$$

As $df/d\lambda_1$ is negative, we accelerate the search by taking $\lambda_1 = 4t_0 = 1.00$.

At $\lambda_1 = 1.00$:

$$f = 795.98, \qquad \frac{df}{d\lambda_1} = -1269.18$$

Since $df/d\lambda_1$ is still negative, we take $\lambda_1 = 2.00$.

At $\lambda_1 = 2.00$:

$$f = 227.32, \qquad \frac{df}{d\lambda_1} = -113.953$$

Although $df/d\lambda_1$ is still negative, it appears to have come close to zero and
hence we take the next value of $\lambda_1$ as 2.50.

At $\lambda_1 = 2.50$:

$$f = 241.51, \qquad \frac{df}{d\lambda_1} = 174.684 = \text{positive}$$

Since $df/d\lambda_1$ is negative at $\lambda_1 = 2.0$ and positive at $\lambda_1 = 2.5$, we take A =
2.0 (instead of zero for faster convergence) and B = 2.5. Therefore,

A = 2.0, $f_A$ = 227.32, $f'_A$ = -113.95

B = 2.5, $f_B$ = 241.51, $f'_B$ = 174.68

Stage 3: To find the optimal step length $\tilde{\lambda}_1^*$ using Eq. (5.54), we compute

$$Z = \frac{3(227.32 - 241.51)}{2.5 - 2.0} - 113.95 + 174.68 = -24.41$$

$$Q = [(24.41)^2 + (113.95)(174.68)]^{1/2} = 143.2$$

Therefore,

$$\tilde{\lambda}_1^* = 2.0 + \frac{-113.95 - 24.41 + 143.2}{-113.95 + 174.68 - 48.82}\,(2.5 - 2.0) = 2.2$$

Stage 4: To find whether $\tilde{\lambda}_1^*$ is close to $\lambda_1^*$, we test the value of $df/d\lambda_1$.

$$\left.\frac{df}{d\lambda_1}\right|_{\tilde{\lambda}_1^*} = -0.818$$

Also,

$$f(\lambda_1 = \tilde{\lambda}_1^*) = 216.1$$

Since $df/d\lambda_1$ is not close to zero at $\tilde{\lambda}_1^*$, we use a refitting technique.

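Second Fitting: Now we take A = $\tilde{\lambda}_1^*$ since $df/d\lambda_1$ is negative at $\tilde{\lambda}_1^*$ and B = 2.5.
Thus

A = 2.2, $f_A$ = 216.10, $f'_A$ = -0.818

B = 2.5, $f_B$ = 241.51, $f'_B$ = 174.68

With these values we find that

$$Z = \frac{3(216.1 - 241.51)}{2.5 - 2.2} - 0.818 + 174.68 = -80.238$$

$$Q = [(80.238)^2 + (0.818)(174.68)]^{1/2} = 81.1$$

$$\tilde{\lambda}_1^* = 2.2 + \frac{-0.818 - 80.238 + 81.1}{-0.818 + 174.68 - 160.476}\,(2.5 - 2.2) = 2.201$$

To test for convergence, we evaluate $df/d\lambda_1$ at $\tilde{\lambda}_1^*$. Since $df/d\lambda_1 |_{\lambda_1 = \tilde{\lambda}_1^*} = -0.211$, it can
be assumed to be sufficiently close to zero and hence we take $\lambda_1^* \approx \tilde{\lambda}_1^* = 2.201$. This
gives

$$X_2 = X_1 + \lambda_1^* S_1 = \begin{Bmatrix} -2 + 0.970\lambda_1^* \\ -2 + 0.244\lambda_1^* \end{Bmatrix} = \begin{Bmatrix} 0.135 \\ -1.463 \end{Bmatrix}$$

Testing $X_2$ for convergence: To test whether the DFP method has converged,
we compute the gradient of $f$ at $X_2$:

$$\nabla f_2 = \begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix}_{X_2} = \begin{Bmatrix} 78.29 \\ -296.24 \end{Bmatrix}$$

As the components of this vector are not close to zero, $X_2$ is not optimum and hence
the procedure has to be continued until the optimum point is found.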
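
The two fittings of Example 6.14 follow the same arithmetic, which can be sketched as a small Python function. This is an illustration under stated assumptions: the function name `cubic_step` is hypothetical, and the formula coded is the one used in Stage 3 of the example (the text attributes it to Eq. (5.54)), with the positive sign taken for Q.

```python
import math


def cubic_step(A, fA, dfA, B, fB, dfB):
    """One cubic-interpolation estimate of the minimizing step in [A, B].

    fA, fB are function values and dfA, dfB are slopes at A and B;
    the slopes are assumed to have opposite signs (dfA < 0 < dfB).
    """
    Z = 3.0 * (fA - fB) / (B - A) + dfA + dfB
    Q = math.sqrt(Z * Z - dfA * dfB)
    return A + (dfA + Z + Q) / (dfA + dfB + 2.0 * Z) * (B - A)


# First fitting of Example 6.14
first = cubic_step(2.0, 227.32, -113.95, 2.5, 241.51, 174.68)

# Second fitting, with A moved to the first estimate (rounded to 2.2 as in the text)
second = cubic_step(2.2, 216.10, -0.818, 2.5, 241.51, 174.68)

print(round(first, 3), round(second, 3))
```

Because the slopes at A and B have opposite signs, the quantity under the square root is positive, so the function needs no special handling of a negative radicand in this setting.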

Example 6.15 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$ from the starting
point $X_1 = \{0 \;\; 0\}^T$ using the DFP method with

$$[B_1] = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \varepsilon = 0.01$$
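SOLUTION

Iteration 1 (i = 1)

Here

$$\nabla f_1 = \nabla f(X_1) = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix}$$

and hence

$$S_1 = -[B_1] \nabla f_1 = -\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$$

To find the minimizing step length $\lambda_1^*$ along $S_1$, we minimize

$$f(X_1 + \lambda_1 S_1) = f\left( \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + \lambda_1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} \right) = f(-\lambda_1, \lambda_1) = \lambda_1^2 - 2\lambda_1$$

with respect to $\lambda_1$. Since $df/d\lambda_1 = 0$ at $\lambda_1^* = 1$, we obtain

$$X_2 = X_1 + \lambda_1^* S_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + 1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$$

Since $\nabla f_2 = \nabla f(X_2) = \{-1 \;\; -1\}^T$ and $\|\nabla f_2\| = 1.4142 > \varepsilon$, we proceed to update the
matrix $[B_i]$ by computing

$$g_1 = \nabla f_2 - \nabla f_1 = \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} - \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} -2 \\ 0 \end{Bmatrix}$$

$$S_1^T g_1 = \{-1 \;\; 1\} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = 2$$

$$S_1 S_1^T = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} \{-1 \;\; 1\} = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$

$$[B_1] g_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = \begin{Bmatrix} -2 \\ 0 \end{Bmatrix}, \qquad ([B_1] g_1)^T = \{-2 \;\; 0\}$$

$$g_1^T [B_1] g_1 = \{-2 \;\; 0\} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = \{-2 \;\; 0\} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = 4$$

$$[M_1] = \lambda_1^* \frac{S_1 S_1^T}{S_1^T g_1} = 1 \cdot \frac{1}{2} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}$$

$$[N_1] = -\frac{([B_1] g_1)([B_1] g_1)^T}{g_1^T [B_1] g_1} = -\frac{1}{4} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} \{-2 \;\; 0\} = -\frac{1}{4} \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix} = -\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$

$$[B_2] = [B_1] + [M_1] + [N_1] = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} + \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0.5 & -0.5 \\ -0.5 & 1.5 \end{bmatrix}$$

Iteration 2 (i = 2)

The next search direction is determined as

$$S_2 = -[B_2] \nabla f_2 = -\begin{bmatrix} 0.5 & -0.5 \\ -0.5 & 1.5 \end{bmatrix} \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}$$

To find the minimizing step length $\lambda_2^*$ along $S_2$, we minimize

$$f(X_2 + \lambda_2 S_2) = f\left( \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} + \lambda_2 \begin{Bmatrix} 0 \\ 1 \end{Bmatrix} \right) = f(-1, 1 + \lambda_2)$$

$$= -1 - (1 + \lambda_2) + 2(-1)^2 + 2(-1)(1 + \lambda_2) + (1 + \lambda_2)^2 = \lambda_2^2 - \lambda_2 - 1$$

with respect to $\lambda_2$. Since $df/d\lambda_2 = 0$ at $\lambda_2^* = \tfrac{1}{2}$, we obtain

$$X_3 = X_2 + \lambda_2^* S_2 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} + \frac{1}{2} \begin{Bmatrix} 0 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1.5 \end{Bmatrix}$$

This point can be identified to be optimum since

$$\nabla f_3 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \quad \text{and} \quad \|\nabla f_3\| = 0 < \varepsilon$$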
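
The two iterations of Example 6.15 can be reproduced with a short Python sketch. It is an illustration, not the text's implementation, and rests on two assumptions: because the function is quadratic with constant Hessian $[A]$, the exact step length is computed in closed form as $\lambda^* = -S^T \nabla f / (S^T [A] S)$ instead of by a one-dimensional search, and the function and variable names (`grad`, `dfp_minimize`) are hypothetical.

```python
def grad(x):
    """Gradient of f = x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2."""
    return [1 + 4 * x[0] + 2 * x[1], -1 + 2 * x[0] + 2 * x[1]]


A = [[4.0, 2.0], [2.0, 2.0]]  # constant Hessian of f


def mat_vec(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dfp_minimize(x, eps=0.01, max_iter=10):
    """DFP iterations with [B_1] = [I] and an exact step for a quadratic."""
    n = len(x)
    B = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    g_old = grad(x)
    for _ in range(max_iter):
        S = [-v for v in mat_vec(B, g_old)]              # Eq. (6.128)
        step = -dot(S, g_old) / dot(S, mat_vec(A, S))    # exact lambda*
        x = [xi + step * si for xi, si in zip(x, S)]     # Eq. (6.129)
        g_new = grad(x)
        if dot(g_new, g_new) ** 0.5 <= eps:              # optimality test
            break
        g = [a - b for a, b in zip(g_new, g_old)]        # Eq. (6.133)
        Bg = mat_vec(B, g)
        sg, gBg = dot(S, g), dot(g, Bg)
        B = [[B[r][c] + step * S[r] * S[c] / sg - Bg[r] * Bg[c] / gBg
              for c in range(n)] for r in range(n)]      # Eq. (6.130)
        g_old = g_new
    return x


x_opt = dfp_minimize([0.0, 0.0])
print(x_opt)
```

For a general nonlinear function the closed-form step is not available, and the line `step = ...` would be replaced by a one-dimensional minimization such as the cubic interpolation method of Example 6.14.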


6.15 BROYDEN-FLETCHER-GOLDFARB-SHANNO METHOD 

As stated earlier, a major difference between the DFP and BFGS methods is that in 
the BFGS method, the Hessian matrix is updated iteratively rather than the inverse of 
the Hessian matrix. The BFGS method can be described by the following steps. 

1. Start with an initial point $X_1$ and an $n \times n$ positive definite symmetric matrix $[B_1]$
as an initial estimate of the inverse of the Hessian matrix of $f$. In the absence
of additional information, $[B_1]$ is taken as the identity matrix $[I]$. Compute the
gradient vector $\nabla f_1 = \nabla f(X_1)$ and set the iteration number as $i = 1$.

2. Compute the gradient of the function, $\nabla f_i$, at point $X_i$, and set

$$S_i = -[B_i] \nabla f_i \tag{6.134}$$

3. Find the optimal step length $\lambda_i^*$ in the direction $S_i$ and set

$$X_{i+1} = X_i + \lambda_i^* S_i \tag{6.135}$$

4. Test the point $X_{i+1}$ for optimality. If $\|\nabla f_{i+1}\| \le \varepsilon$, where $\varepsilon$ is a small quantity,
take $X^* \approx X_{i+1}$ and stop the process. Otherwise, go to step 5.

5. Update the matrix $[B_i]$ as

$$[B_{i+1}] = [B_i] + \left( 1 + \frac{g_i^T [B_i] g_i}{d_i^T g_i} \right) \frac{d_i d_i^T}{d_i^T g_i} - \frac{d_i g_i^T [B_i]}{d_i^T g_i} - \frac{[B_i] g_i d_i^T}{d_i^T g_i} \tag{6.136}$$

where

$$d_i = X_{i+1} - X_i = \lambda_i^* S_i \tag{6.137}$$

$$g_i = \nabla f_{i+1} - \nabla f_i = \nabla f(X_{i+1}) - \nabla f(X_i) \tag{6.138}$$

6. Set the new iteration number as $i = i + 1$ and go to step 2.

Remarks: 

1. The BFGS method can be considered as a quasi-Newton, conjugate gradient, 
and variable metric method. 

2. Since the inverse of the Hessian matrix is approximated, the BFGS method can 
be called an indirect update method. 

3. If the step lengths $\lambda_i^*$ are found accurately, the matrix $[B_i]$ retains its positive
definiteness as the value of $i$ increases. However, in practical application, the
matrix $[B_i]$ might become indefinite or even singular if $\lambda_i^*$ are not found accurately.
As such, periodical resetting of the matrix $[B_i]$ to the identity matrix $[I]$
is desirable. However, numerical experience indicates that the BFGS method is
less influenced by errors in $\lambda_i^*$ than is the DFP method.

4. It has been shown that the BFGS method exhibits superlinear convergence near 
X* [6.19]. 
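
The update formula of Eq. (6.136) can be written compactly in Python. The following is a minimal sketch rather than a reference implementation: the names `bfgs_update`, `mat_vec`, and `dot` are hypothetical, matrices are nested lists, and the sample call uses the quantities $[B_1]$, $d_1$, and $g_1$ of Example 6.16.

```python
def mat_vec(M, v):
    """Product of a square matrix (nested lists) and a vector."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def dot(u, v):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def bfgs_update(B, d, g):
    """Return [B_{i+1}] from Eq. (6.136).

    B : current matrix [B_i]
    d : d_i = X_{i+1} - X_i, Eq. (6.137)
    g : g_i = grad f_{i+1} - grad f_i, Eq. (6.138)
    """
    n = len(d)
    dg = dot(d, g)                                   # d_i^T g_i
    Bg = mat_vec(B, g)                               # [B_i] g_i
    gB = [sum(g[r] * B[r][c] for r in range(n))      # g_i^T [B_i]
          for c in range(n)]
    coef = (1.0 + dot(g, Bg) / dg) / dg
    return [[B[r][c] + coef * d[r] * d[c]
             - d[r] * gB[c] / dg
             - Bg[r] * d[c] / dg
             for c in range(n)] for r in range(n)]


# Data of Example 6.16: [B_1] = [I], d_1 = {-1, 1}, g_1 = {-2, 0}
B2 = bfgs_update([[1.0, 0.0], [0.0, 1.0]], [-1.0, 1.0], [-2.0, 0.0])
print(B2)
```

In line with Remark 3, a practical routine would also check that $d_i^T g_i$ is positive before updating and reset $[B_i]$ to $[I]$ otherwise; that safeguard is omitted here for brevity.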


Example 6.16 Minimize $f(x_1, x_2) = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$ from the starting
point $X_1 = \{0 \;\; 0\}^T$ using the BFGS method with

$$[B_1] = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \varepsilon = 0.01$$


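SOLUTION

Iteration 1 (i = 1)

Here

$$\nabla f_1 = \nabla f(X_1) = \begin{Bmatrix} 1 + 4x_1 + 2x_2 \\ -1 + 2x_1 + 2x_2 \end{Bmatrix}_{(0,0)} = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix}$$

and hence

$$S_1 = -[B_1] \nabla f_1 = -\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$$

To find the minimizing step length $\lambda_1^*$ along $S_1$, we minimize

$$f(X_1 + \lambda_1 S_1) = f\left( \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + \lambda_1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} \right) = f(-\lambda_1, \lambda_1) = \lambda_1^2 - 2\lambda_1$$

with respect to $\lambda_1$. Since $df/d\lambda_1 = 0$ at $\lambda_1^* = 1$, we obtain

$$X_2 = X_1 + \lambda_1^* S_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} + 1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$$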

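Since $\nabla f_2 = \nabla f(X_2) = \{-1 \;\; -1\}^T$ and $\|\nabla f_2\| = 1.4142 > \varepsilon$, we proceed to update the
matrix $[B_i]$ by computing

$$g_1 = \nabla f_2 - \nabla f_1 = \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} - \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} -2 \\ 0 \end{Bmatrix}$$

$$d_1 = \lambda_1^* S_1 = 1 \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}$$

$$d_1 d_1^T = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} \{-1 \;\; 1\} = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$

$$d_1^T g_1 = \{-1 \;\; 1\} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = 2$$

$$d_1 g_1^T = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} \{-2 \;\; 0\} = \begin{bmatrix} 2 & 0 \\ -2 & 0 \end{bmatrix}$$

$$g_1 d_1^T = \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} \{-1 \;\; 1\} = \begin{bmatrix} 2 & -2 \\ 0 & 0 \end{bmatrix}$$

$$g_1^T [B_1] g_1 = \{-2 \;\; 0\} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = \{-2 \;\; 0\} \begin{Bmatrix} -2 \\ 0 \end{Bmatrix} = 4$$

$$d_1 g_1^T [B_1] = \begin{bmatrix} 2 & 0 \\ -2 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ -2 & 0 \end{bmatrix}$$

$$[B_1] g_1 d_1^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & -2 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ 0 & 0 \end{bmatrix}$$

Equation (6.136) gives

$$[B_2] = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \left( 1 + \frac{4}{2} \right) \frac{1}{2} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 2 & 0 \\ -2 & 0 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 2 & -2 \\ 0 & 0 \end{bmatrix}$$

$$= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} \tfrac{3}{2} & -\tfrac{3}{2} \\ -\tfrac{3}{2} & \tfrac{3}{2} \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ -1 & 0 \end{bmatrix} - \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{5}{2} \end{bmatrix}$$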

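Iteration 2 (i = 2)

The next search direction is determined as

$$S_2 = -[B_2] \nabla f_2 = -\begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{5}{2} \end{bmatrix} \begin{Bmatrix} -1 \\ -1 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 2 \end{Bmatrix}$$

To find the minimizing step length $\lambda_2^*$ along $S_2$, we minimize

$$f(X_2 + \lambda_2 S_2) = f\left( \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} + \lambda_2 \begin{Bmatrix} 0 \\ 2 \end{Bmatrix} \right) = f(-1, 1 + 2\lambda_2) = 4\lambda_2^2 - 2\lambda_2 - 1$$

with respect to $\lambda_2$. Since $df/d\lambda_2 = 0$ at $\lambda_2^* = \tfrac{1}{4}$, we obtain

$$X_3 = X_2 + \lambda_2^* S_2 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix} + \frac{1}{4} \begin{Bmatrix} 0 \\ 2 \end{Bmatrix} = \begin{Bmatrix} -1 \\ \tfrac{3}{2} \end{Bmatrix}$$

This point can be identified to be optimum since

$$\nabla f_3 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \quad \text{and} \quad \|\nabla f_3\| = 0 < \varepsilon$$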
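
Examples 6.15 and 6.16 arrive at the same point $X_3$ even though the DFP and BFGS updates give different matrices $[B_2]$. The Python sketch below illustrates this for the second iteration only. It assumes the values of $[B_2]$, $X_2$, and $\nabla f_2$ computed in the two examples and, because the function is quadratic, finds the step length in closed form as $\lambda^* = -S^T \nabla f / (S^T [A] S)$; the helper `next_point` is a hypothetical name.

```python
def mat_vec(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


A = [[4.0, 2.0], [2.0, 2.0]]   # Hessian of f = x1 - x2 + 2x1^2 + 2x1x2 + x2^2
X2 = [-1.0, 1.0]               # point reached after iteration 1
grad2 = [-1.0, -1.0]           # gradient of f at X2


def next_point(B):
    """Direction, exact step length, and new point for a given matrix [B_2]."""
    S = [-v for v in mat_vec(B, grad2)]
    step = -dot(S, grad2) / dot(S, mat_vec(A, S))
    return S, step, [x + step * s for x, s in zip(X2, S)]


S_dfp, lam_dfp, X3_dfp = next_point([[0.5, -0.5], [-0.5, 1.5]])     # Example 6.15
S_bfgs, lam_bfgs, X3_bfgs = next_point([[0.5, -0.5], [-0.5, 2.5]])  # Example 6.16

print(S_dfp, lam_dfp, X3_dfp)
print(S_bfgs, lam_bfgs, X3_bfgs)
```

The two directions differ only in length, so the exact line search compensates with a different step length and both methods land on the same minimizer of this quadratic.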


6.16 TEST FUNCTIONS 

The efficiency of an optimization algorithm is studied using a set of standard functions.
Several functions, involving different numbers of variables and representing a variety
of complexities, have been used as test functions. Almost all the test functions presented
in the literature are nonlinear least squares; that is, each function can be
represented as

$$f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{m} f_i(x_1, x_2, \ldots, x_n)^2 \tag{6.139}$$

where $n$ denotes the number of variables and $m$ indicates the number of functions ($f_i$)
that define the least-squares problem. The purpose of testing the functions is to show 
how well the algorithm works compared to other algorithms. Usually, each test function 
is minimized from a standard starting point. The total number of function evaluations 
required to find the optimum solution is usually taken as a measure of the efficiency of 
the algorithm. References [6.29] to [6.32] present a comparative study of the various 
unconstrained optimization techniques. Some of the commonly used test functions are 
given below. 

1. Rosenbrock’s parabolic valley [6.8]:

$$f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2 \tag{6.140}$$

$$X_1 = \{-1.2 \;\; 1.0\}^T, \qquad X^* = \{1 \;\; 1\}^T$$

$$f_1 = 24.2, \qquad f^* = 0.0$$
2. A quadratic function:

$$f(x_1, x_2) = (x_1 + 2x_2 - 7)^2 + (2x_1 + x_2 - 5)^2 \tag{6.141}$$

$$X_1 = \{0 \;\; 0\}^T, \qquad X^* = \{1 \;\; 3\}^T$$

$$f_1 = 74.0, \qquad f^* = 0.0$$

3. Powell’s quartic function [6.7]:

$$f(x_1, x_2, x_3, x_4) = (x_1 + 10x_2)^2 + 5(x_3 - x_4)^2 + (x_2 - 2x_3)^4 + 10(x_1 - x_4)^4 \tag{6.142}$$

$$X_1^T = \{x_1 \;\; x_2 \;\; x_3 \;\; x_4\}_1 = \{3 \;\; -1 \;\; 0 \;\; 1\}, \qquad X^{*T} = \{0 \;\; 0 \;\; 0 \;\; 0\}$$

$$f_1 = 215.0, \qquad f^* = 0.0$$

4. Fletcher and Powell’s helical valley [6.21]:

$$f(x_1, x_2, x_3) = 100 \left\{ [x_3 - 10\,\theta(x_1, x_2)]^2 + \left[ \sqrt{x_1^2 + x_2^2} - 1 \right]^2 \right\} + x_3^2 \tag{6.143}$$

where

$$2\pi\,\theta(x_1, x_2) = \begin{cases} \arctan \dfrac{x_2}{x_1} & \text{if } x_1 > 0 \\[2mm] \pi + \arctan \dfrac{x_2}{x_1} & \text{if } x_1 < 0 \end{cases}$$

$$X_1 = \{-1 \;\; 0 \;\; 0\}^T, \qquad X^* = \{1 \;\; 0 \;\; 0\}^T$$

$$f_1 = 2500.0, \qquad f^* = 0.0$$

5. A nonlinear function of three variables [6.7]:

$$f(x_1, x_2, x_3) = \frac{1}{1 + (x_1 - x_2)^2} + \sin\left( \tfrac{1}{2} \pi x_2 x_3 \right) + \exp\left[ -\left( \frac{x_1 + x_3}{x_2} - 2 \right)^2 \right] \tag{6.144}$$

$$X_1 = \{0 \;\; 1 \;\; 2\}^T, \qquad X^* = \{1 \;\; 1 \;\; 1\}^T$$

$$f_1 = 1.5, \qquad f^* = f_{\max} = 3.0$$

6. Freudenstein and Roth function [6.27]:

$$f(x_1, x_2) = \{-13 + x_1 + [(5 - x_2)x_2 - 2]x_2\}^2 + \{-29 + x_1 + [(x_2 + 1)x_2 - 14]x_2\}^2 \tag{6.145}$$

$$X_1 = \{0.5 \;\; -2\}^T, \qquad X^* = \{5 \;\; 4\}^T, \qquad X^*_{\text{alternate}} = \{11.41\ldots \;\; -0.8968\ldots\}^T$$

$$f_1 = 400.5, \qquad f^* = 0.0, \qquad f^*_{\text{alternate}} = 48.9842\ldots$$


7. Powell's badly scaled function [6.28]:

$$f(x_1, x_2) = (10{,}000\, x_1 x_2 - 1)^2 + [\exp(-x_1) + \exp(-x_2) - 1.0001]^2 \tag{6.146}$$

$$X_1 = \{0 \;\; 1\}^T, \qquad X^* = \{1.098\ldots \times 10^{-5} \;\; 9.106\ldots\}^T$$

$$f_1 = 1.1354, \qquad f^* = 0.0$$

8. Brown's badly scaled function [6.29]:

$$f(x_1, x_2) = (x_1 - 10^6)^2 + (x_2 - 2 \times 10^{-6})^2 + (x_1 x_2 - 2)^2 \tag{6.147}$$

$$X^* = \{10^6 \;\; 2 \times 10^{-6}\}^T$$

$$f_1 \approx 10^{12}, \qquad f^* = 0.0$$


9. Beale’s function [6.29]:

$$f(x_1, x_2) = [1.5 - x_1(1 - x_2)]^2 + [2.25 - x_1(1 - x_2^2)]^2 + [2.625 - x_1(1 - x_2^3)]^2 \tag{6.148}$$

$$X_1 = \{1 \;\; 1\}^T, \qquad X^* = \{3 \;\; 0.5\}^T$$

$$f_1 = 14.203125, \qquad f^* = 0.0$$

10. Wood’s function [6.30]:

$$f(x_1, x_2, x_3, x_4) = [10(x_2 - x_1^2)]^2 + (1 - x_1)^2 + 90(x_4 - x_3^2)^2 + (1 - x_3)^2 + 10(x_2 + x_4 - 2)^2 + 0.1(x_2 - x_4)^2 \tag{6.149}$$

$$X_1 = \{-3 \;\; -1 \;\; -3 \;\; -1\}^T, \qquad X^* = \{1 \;\; 1 \;\; 1 \;\; 1\}^T$$

$$f_1 = 19192.0, \qquad f^* = 0.0$$
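
When coding test functions it is easy to misplace a power or a sign, so it is worth checking each implementation against the tabulated starting value $f_1$. The Python sketch below does this for four of the functions above. It is an illustration only; the function names are hypothetical, and the starting points are those listed with Eqs. (6.140), (6.142), (6.145), and (6.149).

```python
def rosenbrock(x1, x2):
    """Rosenbrock's parabolic valley, Eq. (6.140)."""
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2


def powell_quartic(x1, x2, x3, x4):
    """Powell's quartic function, Eq. (6.142)."""
    return ((x1 + 10.0 * x2) ** 2 + 5.0 * (x3 - x4) ** 2
            + (x2 - 2.0 * x3) ** 4 + 10.0 * (x1 - x4) ** 4)


def freudenstein_roth(x1, x2):
    """Freudenstein and Roth function, Eq. (6.145)."""
    t1 = -13.0 + x1 + ((5.0 - x2) * x2 - 2.0) * x2
    t2 = -29.0 + x1 + ((x2 + 1.0) * x2 - 14.0) * x2
    return t1 ** 2 + t2 ** 2


def wood(x1, x2, x3, x4):
    """Wood's function, Eq. (6.149)."""
    return ((10.0 * (x2 - x1 ** 2)) ** 2 + (1.0 - x1) ** 2
            + 90.0 * (x4 - x3 ** 2) ** 2 + (1.0 - x3) ** 2
            + 10.0 * (x2 + x4 - 2.0) ** 2 + 0.1 * (x2 - x4) ** 2)


# Values at the standard starting points
f_rosen = rosenbrock(-1.2, 1.0)
f_powell = powell_quartic(3.0, -1.0, 0.0, 1.0)
f_fr = freudenstein_roth(0.5, -2.0)
f_wood = wood(-3.0, -1.0, -3.0, -1.0)

# Values at the reported minimizers (each should be zero)
f_rosen_min = rosenbrock(1.0, 1.0)
f_fr_min = freudenstein_roth(5.0, 4.0)
f_wood_min = wood(1.0, 1.0, 1.0, 1.0)

print(f_rosen, f_powell, f_fr, f_wood)
```

Each of these functions is a sum of squared terms, in keeping with the least-squares form of Eq. (6.139), which is why the optimum value is zero wherever all the terms vanish simultaneously.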


6.17 MATLAB SOLUTION OF UNCONSTRAINED OPTIMIZATION 
PROBLEMS 

The solution of multivariable unconstrained minimization problems using the MATLAB
function fminunc is illustrated in this section.

Example 6.17 Find the minimum of Rosenbrock’s parabolic valley function, given
by Eq. (6.140), starting from the initial point $X_1 = \{-1.2 \;\; 1.0\}^T$.
SOLUTION

Step 1: Write an M-file objfun.m for the objective function.

function f = objfun(x)
% Rosenbrock's parabolic valley function, Eq. (6.140)
f = 100*(x(2) - x(1)*x(1))^2 + (1 - x(1))^2;

Step 2: Invoke unconstrained optimization program (write this in new MATLAB file).

clc
clear all
warning off
x0 = [-1.2, 1.0]; % Starting guess
fprintf('The values of function value at starting point\n');
f = objfun(x0)
options = optimset('LargeScale', 'off'); % use the medium-scale algorithm
[x, fval] = fminunc(@objfun, x0, options)

This produces the solution or output as follows:

The values of function value at starting point
f =
   24.2000
Optimization terminated: relative infinity-norm of gradient less than options.TolFun.
x =
    1.0000 1.0000
fval =
  2.8336e-011
REVIEW QUESTIONS 

6.1 State the necessary and sufficient conditions for the unconstrained minimum of a function. 

6.2 Give three reasons why the study of unconstrained minimization methods is important. 

6.3 What is the major difference between zeroth-, first-, and second-order methods? 

6.4 What are the characteristics of a direct search method? 

6.5 What is a descent method? 

6.6 Define each term: 

(a) Pattern directions 

(b) Conjugate directions 

(c) Simplex 

(d) Gradient of a function 

(e) Hessian matrix of a function 

6.7 State the iterative approach used in unconstrained optimization. 

6.8 What is quadratic convergence? 

6.9 What is the difference between linear and superlinear convergence? 

6.10 Define the condition number of a square matrix. 

6.11 Why is the scaling of variables important? 

6.12 What is the difference between random jumping and random walk methods? 

6.13 Under what conditions are the processes of reflection, expansion, and contraction used in 
the simplex method? 

6.14 When is the grid search method preferred in minimizing an unconstrained function? 

6.15 Why is a quadratically convergent method considered to be superior for the minimization 
of a nonlinear function? 
6.16 Why is Powell’s method called a pattern search method?

6.17 What are the roles of univariate and pattern moves in Powell’s method?

6.18 What is the univariate method?

6.19 Indicate a situation where a central difference formula is not as accurate as a forward 
difference formula. 

6.20 Why is a central difference formula more expensive than a forward or backward difference 
formula in finding the gradient of a function? 

6.21 What is the role of one-dimensional minimization methods in solving an unconstrained 
minimization problem? 

6.22 State possible convergence criteria that can be used in direct search methods. 

6.23 Why is the steepest descent method not efficient in practice, although the directions used 
are the best directions? 

6.24 What are rank 1 and rank 2 updates? 

6.25 How are the search directions generated in the Fletcher-Reeves method? 

6.26 Give examples of methods that require n 2 , n , and 1 one-dimensional minimizations for 
minimizing a quadratic in n variables. 

6.27 What is the reason for possible divergence of Newton’s method? 

6.28 Why is a conjugate directions method preferred in solving a general nonlinear problem? 

6.29 What is the difference between Newton and quasi-Newton methods? 

6.30 What is the basic difference between DFP and BFGS methods? 

6.31 Why are the search directions reset to the steepest descent directions periodically in the 
DFP method? 

6.32 What is a metric? Why is the DFP method considered as a variable metric method? 

6.33 Answer true or false: 

(a) A conjugate gradient method can be called a conjugate directions method. 

(b) A conjugate directions method can be called a conjugate gradient method. 

(c) In the DFP method, the Hessian matrix is sequentially updated directly. 

(d) In the BFGS method, the inverse of the Hessian matrix is sequentially updated. 

(e) The Newton method requires the inversion of an n x n matrix in each iteration. 

(f) The DFP method requires the inversion of an n x n matrix in each iteration. 

(g) The steepest descent directions are the best possible directions. 

(h) The central difference formula always gives a more accurate value of the gradient 
than does the forward or backward difference formula. 

(i) Powell’s method is a conjugate directions method. 

(j ) The univariate method is a conjugate directions method. 
PROBLEMS 


6.1 A bar is subjected to an axial load, $P_0$, as shown in Fig. 6.17. By using a one-finite-element
model, the axial displacement, $u(x)$, can be expressed as [6.1]

$$u(x) = \{N_1(x) \;\; N_2(x)\} \begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix}$$

where $N_i(x)$ are called the shape functions:

$$N_1(x) = 1 - \frac{x}{l}, \qquad N_2(x) = \frac{x}{l}$$

and $u_1$ and $u_2$ are the end displacements of the bar. The deflection of the bar at point
Q can be found by minimizing the potential energy of the bar ($f$), which can be
expressed as

$$f = \frac{1}{2} \int_0^l EA \left( \frac{du}{dx} \right)^2 dx - P_0 u_2$$

where $E$ is Young’s modulus and $A$ is the cross-sectional area of the bar. Formulate the
optimization problem in terms of the variables $u_1$ and $u_2$ for the case $P_0 l/EA = 1$.

6.2 The natural frequencies of the tapered cantilever beam ($\omega$) shown in Fig. 6.18, based on
the Rayleigh-Ritz method, can be found by minimizing the function [6.34]:

$$f(c_1, c_2) = \frac{\dfrac{E h^3}{3 l^2} \left( \dfrac{c_1^2}{4} + \dfrac{c_2^2}{10} + \dfrac{c_1 c_2}{5} \right)}{\rho h l \left( \dfrac{c_1^2}{30} + \dfrac{c_2^2}{280} + \dfrac{2 c_1 c_2}{105} \right)}$$

with respect to $c_1$ and $c_2$, where $f = \omega^2$, $E$ is Young’s modulus, and $\rho$ is the density.
Plot the graph of $3 f \rho l^3 / E h^2$ in ($c_1$, $c_2$) space and identify the values of $\omega_1$ and $\omega_2$.

6.3 The Rayleigh’s quotient corresponding to the three-degree-of-freedom spring-mass system
shown in Fig. 6.19 is given by [6.34]

$$R(X) = \frac{X^T [K] X}{X^T [M] X}$$

where

$$[K] = k \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix}, \qquad [M] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad X = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}$$

It is known that the fundamental natural frequency of vibration of the system can be
found by minimizing $R(X)$. Derive the expression of $R(X)$ in terms of $x_1$, $x_2$, and $x_3$ and
suggest a suitable method for minimizing the function $R(X)$.


Figure 6.17 Bar subjected to an axial load.
Figure 6.19 Three-degree-of-freedom spring-mass system.


6.4 The steady-state temperatures at points 1 and 2 of the one-dimensional fin ($x_1$ and $x_2$)
shown in Fig. 6.20 correspond to the minimum of the function [6.1]:

$$f(x_1, x_2) = 0.6382 x_1^2 + 0.3191 x_2^2 - 0.2809 x_1 x_2 - 67.906 x_1 - 14.290 x_2$$

Plot the function $f$ in the ($x_1$, $x_2$) space and identify the steady-state temperatures of
points 1 and 2 of the fin.
Figure 6.20 Straight fin. (Labels: 140°C; 2 cm dia.; k = 70 W/cm-°C; points 1 and 2; $T_\infty$ = 40°C.)


6.5 Figure 6.21 shows two bodies, A and B, connected by four linear springs. The springs are 
at their natural positions when there is no force applied to the bodies. The displacements 
$x_1$ and $x_2$ of the bodies under any applied force can be found by minimizing the potential
energy of the system. Find the displacements of the bodies when forces of 1000 lb and
2000 lb are applied to bodies A and B, respectively, using Newton’s method. Use the
starting vector, X] = {][}. Hint: 

Potential energy of the system = strain energy of springs — potential of applied loads 


where the strain energy of a spring of stiffness $k$ and end displacements $x_1$ and $x_2$ is
given by $\tfrac{1}{2} k (x_2 - x_1)^2$ and the potential of the applied force, $F_i$, is given by $x_i F_i$.

6.6 The potential energy of the two-bar truss shown in Fig. 6.22 under the applied load $P$ is
given by

$$f(x_1, x_2) = \frac{EA}{s} \left( \frac{l}{2s} \right)^2 x_1^2 + \frac{EA}{s} \left( \frac{h}{s} \right)^2 x_2^2 - P x_1 \cos\theta - P x_2 \sin\theta$$

where $E$ is Young’s modulus, $A$ the cross-sectional area of each member, $l$ the span of
the truss, $s$ the length of each member, $h$ the depth of the truss, $\theta$ the angle at which load
is applied, $x_1$ the horizontal displacement of the free node, and $x_2$ the vertical displacement
of the free node.

(a) Simplify the expression of $f$ for the data $E = 207 \times 10^9$ Pa, $A = 10^{-5}$ m², $l = 1.5$ m,
$h = 4$ m, $P = 10{,}000$ N, and $\theta = 30°$.
Figure 6.21 Two bodies connected by springs.

(b) Find the steepest descent direction, $S_1$, of $f$ at the trial vector $X_1 = \{0 \;\; 0\}^T$.

(c) Derive the one-dimensional minimization problem, $f(\lambda)$, at $X_1$ along the direction
$S_1$.

(d) Find the optimal step length $\lambda^*$ using the calculus method and find the new design
vector $X_2$.


6.7 Three carts, interconnected by springs, are subjected to the loads $P_1$, $P_2$, and $P_3$ as shown
in Fig. 6.23. The displacements of the carts can be found by minimizing the potential
energy of the system ($f$):

$$f(X) = \tfrac{1}{2} X^T [K] X - X^T P$$

where

$$[K] = \begin{bmatrix} k_1 + k_4 + k_5 & -k_4 & -k_5 \\ -k_4 & k_2 + k_4 + k_6 & -k_6 \\ -k_5 & -k_6 & k_3 + k_5 + k_6 + k_7 + k_8 \end{bmatrix}$$

$$P = \begin{Bmatrix} P_1 \\ P_2 \\ P_3 \end{Bmatrix} \quad \text{and} \quad X = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}$$

Derive the function $f(x_1, x_2, x_3)$ for the following data: $k_1$ = 5000 N/m, $k_2$ = 1500 N/m,
$k_3$ = 2000 N/m, $k_4$ = 1000 N/m, $k_5$ = 2500 N/m, $k_6$ = 500 N/m, $k_7$ = 3000 N/m, $k_8$ =
3500 N/m, $P_1$ = 1000 N, $P_2$ = 2000 N, and $P_3$ = 3000 N. Complete one iteration of
Newton’s method and find the equilibrium configuration of the carts. Use $X_1 = \{0 \;\; 0 \;\; 0\}^T$.

6.8 Plot the contours of the following function over the region ($-5 \le x_1 \le 5$, $-3 \le x_2 \le 6$)
and identify the optimum point:

$$f(x_1, x_2) = (x_1 + 2x_2 - 7)^2 + (2x_1 + x_2 - 5)^2$$
Figure 6.23 Three carts interconnected by springs. 


6.9 Plot the contours of the following function in the two-dimensional ($x_1$, $x_2$) space over the
region ($-4 \le x_1 \le 4$, $-3 \le x_2 \le 6$) and identify the optimum point:

$$f(x_1, x_2) = 2(x_2 - x_1^2)^2 + (1 - x_1)^2$$

6.10 Consider the problem

$$f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$$

Plot the contours of $f$ over the region ($-4 \le x_1 \le 4$, $-3 \le x_2 \le 6$) and identify the
optimum point.

6.11 It is required to find the solution of a system of linear algebraic equations given by 
[A]X = b, where [A] is a known n x n symmetric positive-definite matrix and b is an 
n -component vector of known constants. Develop a scheme for solving the problem as 
an unconstrained minimization problem. 

6.12 Solve the following equations using the steepest descent method (two iterations only)
with the starting point, $X_1 = \{0 \;\; 0 \;\; 0\}^T$:

$$2x_1 + x_2 = 4, \qquad x_1 + 2x_2 + x_3 = 8, \qquad x_2 + 3x_3 = 11$$

6.13 An electric power of 100 MW generated at a hydroelectric power plant is to be transmitted
400 km to a stepdown transformer for distribution at 11 kV. The power dissipated due to
the resistance of conductors is $i^2 c^{-1}$, where $i$ is the line current in amperes and $c$ is the
conductance in mhos. The resistance loss, based on the cost of power delivered, can be
expressed as $0.15\, i^2 c^{-1}$ dollars. The power transmitted ($k$) is related to the transmission
line voltage at the power plant ($e$) by the relation $k = \sqrt{3}\, e i$, where $e$ is in kilovolts. The
cost of conductors is given by $2c$ millions of dollars, and the investment in equipment
needed to accommodate the voltage $e$ is given by $500e$ dollars. Find the values of $e$ and
$c$ to minimize the total cost of transmission using Newton's method (one iteration only).

6.14 Find a suitable transformation of variables to reduce the condition number of the Hessian
matrix of the following function to one:

$$f = 2x_1^2 + 16x_2^2 - 2x_1 x_2 - x_1 - 6x_2 - 5$$
6.15 Find a suitable transformation or scaling of variables to reduce the condition number of
the Hessian matrix of the following function to one:

$$f = 4x_1^2 + 3x_2^2 - 5x_1 x_2 - 8x_1 + 10$$

6.16 Determine whether the following vectors serve as conjugate directions for minimizing the
function $f = 2x_1^2 + 16x_2^2 - 2x_1 x_2 - x_1 - 6x_2 - 5$.

(a) $S_1 = \{15 \;\; -1\}^T$, $S_2 = \{1 \;\; 1\}^T$

(b) $S_1 = \{-1 \;\; 15\}^T$, $S_2 = \{1 \;\; 1\}^T$

6.17 Consider the problem:

Minimize $f = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$

Find the solution of this problem in the range $-10 \le x_i \le 10$, $i = 1, 2$, using the random
jumping method. Use a maximum of 10,000 function evaluations.

6.18 Consider the problem:

Minimize $f = 6x_1^2 - 6x_1 x_2 + 2x_2^2 - x_1 - 2x_2$

Find the minimum of this function in the range $-5 \le x_i \le 5$, $i = 1, 2$, using the random
walk method with direction exploitation.

6.19 Find the condition number of each matrix.

(a) $[A] = \begin{bmatrix} 1 & 2 \\ 1.0001 & 2 \end{bmatrix}$

(b) $[B] = \begin{bmatrix} 3.9 & 1.6 \\ 6.8 & 2.9 \end{bmatrix}$

6.20 Perform two iterations of Newton’s method to minimize the function

$$f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$$

from the starting point

6.21 Perform two iterations of the univariate method to minimize the function given in Problem
6.20 from the stated starting vector.

6.22 Perform four iterations of Powell’s method to minimize the function given in Problem
6.20 from the stated starting point.

6.23 Perform two iterations of the steepest descent method to minimize the function given in
Problem 6.20 from the stated starting point.

6.24 Perform two iterations of the Fletcher-Reeves method to minimize the function given in
Problem 6.20 from the stated starting point.

6.25 Perform two iterations of the DFP method to minimize the function given in Problem
6.20 from the stated starting vector.

6.26 Perform two iterations of the BFGS method to minimize the function given in Problem 
6.20 from the indicated starting point. 
6.27 Perform two iterations of Marquardt’s method to minimize the function given in
Problem 6.20 from the stated starting point.

6.28 Prove that the search directions used in the Fletcher-Reeves method are [A]-conjugate
while minimizing the function

$$f(x_1, x_2) = x_1^2 + 4x_2^2$$


6.29 Generate a regular simplex of size 4 in a two-dimensional space using each base point: 

-t\ (b) {il ®{:£ 


(a) 


6.30 Find the coordinates of the vertices of a simplex in a three-dimensional space such that 
the distance between vertices is 0.3 and one vertex is given by (2, —1, —8). 


6.31 Generate a regular simplex of size 3 in a three-dimensional space using each base point. 

(a) Jo J (b) J 3 J ( £ )J- 2 J 

6.32 Find a vector S 2 that is conjugate to the vector 


Si = 

with respect to the matrix: 

[A] = 

6.33 Compare the gradients of the function /(X) = 100(x2 — x 2 ) 2 + (1 — xi) 2 at X = j®' 
given by the following methods: 

(a) Analytical differentiation 

(b) Central difference method 

(c) Forward difference method 

(d) Backward difference method 

Use a perturbation of 0.005 for $x_1$ and $x_2$ in the finite-difference methods.



6.34 It is required to evaluate the gradient of the function 


$$f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$$


at point X = {q^} using a finite-difference scheme. Determine the step size Ax to be 
used to limit the error in any of the components, $\partial f/\partial x_i$, to 1% of the exact value, in
the following methods:

(a) Central difference method 

(b) Forward difference method 

(c) Backward difference method 


6.35 Consider the minimization of the function

$$f = \frac{1}{x_1^2 + x_2^2 + 2}$$

Perform one iteration of Newton’s method from the starting point Xi = {^1 using 
Eq. (6.86). How much improvement is achieved with X2? 

6.36 Consider the problem: 

Minimize / = 2(xi — x 2 ) 2 + (1 — xi) 2 
If a base simplex is defined by the vertices 



find a sequence of four improved vectors using reflection, expansion, and/or contraction. 

6.37 Consider the problem: 

Minimize $f = (x_1 + 2x_2 - 7)^2 + (2x_1 + x_2 - 5)^2$
If a base simplex is defined by the vertices 



find a sequence of four improved vectors using reflection, expansion, and/or contraction. 

6.38 Consider the problem:

$$f = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$$

Find the solution of the problem using grid search with a step size $\Delta x_i = 0.1$ in the range
$-3 \le x_i \le 3$, $i = 1, 2$.

6.39 Show that the property of quadratic convergence of conjugate directions is independent 
of the order in which the one-dimensional minimizations are performed by considering 
the minimization of 

$$f = 6x_1^2 + 2x_2^2 - 6x_1 x_2 - x_1 - 2x_2$$

using the conjugate directions S| = {i} and S2 = {q} and the starting point Xi = {[]}. 

6.40 Show that the optimal step length $\lambda_i^*$ that minimizes $f(X)$ along the search direction
$S_i = -\nabla f_i$ is given by Eq. (6.75).

6.41 Show that j3 2 in Eq. (6.76) is given by Eq. (6.77). 

6.42 Minimize $f = 2x_1^2 + x_2^2$ from the starting point (1, 2) using the univariate method (two
iterations only).

6.43 Minimize $f = 2x_1^2 + x_2^2$ by using the steepest descent method with the starting point
(1, 2) (two iterations only).

6.44 Minimize $f = x_1^2 + 3x_2^2 + 6x_3^2$ by Newton’s method using the starting point
(2, -1, 1).
6.45 Minimize $f = 4x_1^2 + 3x_2^2 - 5x_1 x_2 - 8x_1$ starting from point (0, 0) using Powell’s method.
Perform four iterations.

6.46 Minimize $f(x_1, x_2) = x_1^4 - 2x_1^2 x_2 + x_1^2 + x_2^2 + 2x_1 + 1$ by the simplex method. Perform
two steps of reflection, expansion, and/or contraction.

6.47 Solve the following system of equations using Newton’s method of unconstrained mini- 
mization with the starting point 

x, = joj 

$2x_1 - x_2 + x_3 = -1$, $x_1 + 2x_2 = 0$, $3x_1 + x_2 + 2x_3 = 3$

6.48 It is desired to solve the following set of equations using an unconstrained optimization 
method: 

$x^2 + y^2 = 2$, $10x^2 - 10y - 5x + 1 = 0$

Formulate the corresponding problem and complete two iterations of optimization using 
the DFP method starting from X| = j[j}. 

6.49 Solve Problem 6.48 using the BFGS method (two iterations only). 

6.50 The following nonlinear equations are to be solved using an unconstrained optimization 
method: 

$2xy = 3$, $x^2 - y = 2$

Complete two one-dimensional minimization steps using the univariate method starting 
from the origin. 

6.51 Consider the two equations 

$7x^3 - 10x - y = 1$, $8y^3 - 11y + x = 1$

Formulate the problem as an unconstrained optimization problem and complete two steps 
of the Fletcher-Reeves method starting from the origin. 

6.52 Solve the equations $5x_1 + 3x_2 = 1$ and $4x_1 - 7x_2 = 76$ using the BFGS method with the
starting point (0, 0).

6.53 Indicate the number of one-dimensional steps required for the minimization of the function
$f = x_1^2 + x_2^2 - 2x_1 - 4x_2 + 5$ according to each scheme:

(a) Steepest descent method 

(b) Fletcher-Reeves method 

(c) DFP method 

(d) Newton’s method 

(e) Powell’s method 

(f) Random search method 

(g) BFGS method 

(h) Univariate method 
6.54 Same as Problem 6.53 for the following function:

$f = (x_2 - x_1^2)^2 + (1 - x_1)^2$

6.55 Verify whether the following search directions are [A]-conjugate while minimizing the
function

$f = x_1 - x_2 + 2x_1^2 + 2x_1 x_2 + x_2^2$

6.56 Solve the equations $x_1 + 2x_2 + 3x_3 = 14$, $x_1 - x_2 + x_3 = 1$, and $3x_1 - 2x_2 + x_3 = 2$
using Marquardt’s method of unconstrained minimization. Use the starting point
$X_1 = \{0, 0, 0\}^T$.

6.57 Apply the simplex method to minimize the function f given in Problem 6.20. Use the
point (-1.2, 1.0) as the base point to generate an initial regular simplex of size 2 and go
through three steps of reflection, expansion, and/or contraction.

6.58 Write a computer program to implement Powell’s method using the golden section method
of one-dimensional search.

6.59 Write a computer program to implement the Davidon-Fletcher-Powell method using the
cubic interpolation method of one-dimensional search. Use a finite-difference scheme to
evaluate the gradient of the objective function.

6.60 Write a computer program to implement the BFGS method using the cubic interpolation
method of one-dimensional minimization. Use a finite-difference scheme to evaluate the
gradient of the objective function.

6.61 Write a computer program to implement the steepest descent method of unconstrained
minimization with the direct root method of one-dimensional search.

6.62 Write a computer program to implement the Marquardt method coupled with the direct
root method of one-dimensional search.

6.63 Find the minimum of the quadratic function given by Eq. (6.141) starting from the solution
$X_1 = \{0, 0\}^T$ using MATLAB.

6.64 Find the minimum of Powell’s quartic function given by Eq. (6.142) starting from the
solution $X_1 = \{3, -1, 0, 1\}^T$ using MATLAB.

6.65 Find the minimum of Fletcher and Powell’s helical valley function given by Eq.
(6.143) starting from the solution $X_1 = \{-1, 0, 0\}^T$ using MATLAB.

6.66 Find the minimum of the nonlinear function given by Eq. (6.144) starting from the solution
$X_1 = \{0, 1, 2\}^T$ using MATLAB.

6.67 Find the minimum of Wood’s function given by Eq. (6.149) starting from the solution
$X_1 = \{-3, -1, -3, -1\}^T$ using MATLAB.
Nonlinear Programming III: Constrained Optimization Techniques


7.1 INTRODUCTION 

This chapter deals with techniques that are applicable to the solution of the constrained 
optimization problem: 


Find $X$ which minimizes $f(X)$

subject to

$g_j(X) \le 0, \quad j = 1, 2, \ldots, m$

$h_k(X) = 0, \quad k = 1, 2, \ldots, p$    (7.1)

There are many techniques available for the solution of a constrained nonlinear pro- 
gramming problem. All the methods can be classified into two broad categories: direct 
methods and indirect methods, as shown in Table 7.1. In the direct methods , the con- 
straints are handled in an explicit manner, whereas in most of the indirect methods, the 
constrained problem is solved as a sequence of unconstrained minimization problems. 
We discuss in this chapter all the methods indicated in Table 7.1. 


7.2 CHARACTERISTICS OF A CONSTRAINED PROBLEM 

In the presence of constraints, an optimization problem may have the following features 
[7.1, 7.51]: 

1. The constraints may have no effect on the optimum point; that is, the constrained 
minimum is the same as the unconstrained minimum as shown in Fig. 7.1. In 
this case the minimum point X* can be found by making use of the necessary 
and sufficient conditions 


$\nabla f|_{X^*} = 0$    (7.2)

$J_{X^*} = \left[ \dfrac{\partial^2 f}{\partial x_i \, \partial x_j} \right]_{X^*}$ = positive definite    (7.3)
Table 7.1 Constrained Optimization Techniques

Direct methods                                    Indirect methods
------------------------------------------------------------------------------------------------
Random search methods                             Transformation of variables technique
Heuristic search methods                          Sequential unconstrained minimization techniques
Complex method                                    Interior penalty function method
Objective and constraint approximation methods    Exterior penalty function method
Sequential linear programming method              Augmented Lagrange multiplier method
Sequential quadratic programming method
Methods of feasible directions
Zoutendijk’s method
Rosen’s gradient projection method
Generalized reduced gradient method


Figure 7.1 Constrained and unconstrained minima are the same (linear constraints).


However, to use these conditions, one must be certain that the constraints are not 
going to have any effect on the minimum. For simple optimization problems like 
the one shown in Fig. 7.1, it may be possible to determine beforehand whether 
or not the constraints have an influence on the minimum point. However, in 
most practical problems, even if we have a situation as shown in Fig. 7.1, it will 
be extremely difficult to identify it. Thus one has to proceed with the general 
assumption that the constraints have some influence on the optimum point. 

2. The optimum (unique) solution occurs on a constraint boundary as shown in 
Fig. 7.2. In this case the Kuhn-Tucker necessary conditions indicate that the 
negative of the gradient must be expressible as a positive linear combination of 
the gradients of the active constraints. 
Figure 7.2 Constrained minimum occurring on a nonlinear constraint.

3. If the objective function has two or more unconstrained local minima, the con- 
strained problem may have multiple minima as shown in Fig. 7.3. 

4. In some cases, even if the objective function has a single unconstrained 
minimum, the constraints may introduce multiple local minima as shown in 
Fig. 7.4. 

A constrained optimization technique must be able to locate the minimum in all the 
situations outlined above. 


Figure 7.3 Relative minima introduced by objective function.

Figure 7.4 Relative minima introduced by constraints.



Direct Methods 


7.3 RANDOM SEARCH METHODS 

The random search methods described for unconstrained minimization (Section 6.2) 
can be used, with minor modifications, to solve a constrained optimization problem. 
The basic procedure can be described by the following steps: 

1. Generate a trial design vector using one random number for each design variable. 

2. Verify whether the constraints are satisfied at the trial design vector. Usually, 
the equality constraints are considered satisfied whenever their magnitudes lie 
within a specified tolerance. If any constraint is violated, continue generating 
new trial vectors until a trial vector that satisfies all the constraints is found. 

3. If all the constraints are satisfied, retain the current trial vector as the best 
design if it gives a reduced objective function value compared to the previous 
best available design. Otherwise, discard the current feasible trial vector and 
proceed to step 1 to generate a new trial design vector. 

4. The best design available at the end of generating a specified maximum number 
of trial design vectors is taken as the solution of the constrained optimization 
problem. 

It can be seen that several modifications can be made to the basic procedure indicated 
above. For example, after finding a feasible trial design vector, a feasible direction can 
be generated (using random numbers) and a one-dimensional search can be conducted 
along the feasible direction to find an improved feasible design vector. 
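
The basic four-step procedure above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `random_search`, the sampling box `bounds`, the trial budget, and the small test problem are assumptions introduced here, not part of the text, and equality constraints (which would need a tolerance) are left out.

```python
import random

def random_search(f, ineq, bounds, n_trials=20000, seed=0):
    """Basic constrained random search (steps 1-4 of the procedure).

    f      : objective function of a list x
    ineq   : list of functions g_j; x is feasible when every g_j(x) <= 0
    bounds : list of (low, high) pairs used to sample each variable
    """
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    best_x, best_f = None, float("inf")
    for _ in range(n_trials):
        # Step 1: one random number per design variable
        x = [lo + rng.random() * (hi - lo) for lo, hi in bounds]
        # Step 2: reject the trial vector if any constraint is violated
        if any(g(x) > 0.0 for g in ineq):
            continue
        # Step 3: keep the trial vector only if it improves the objective
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    # Step 4: best feasible design found within the trial budget
    return best_x, best_f

# Illustrative problem (an assumption, not from the text):
# minimize (x1 - 2)^2 + (x2 - 1)^2 subject to x1 + x2 - 2 <= 0
f = lambda x: (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2
g = [lambda x: x[0] + x[1] - 2.0]
x_best, f_best = random_search(f, g, [(-3.0, 3.0), (-3.0, 3.0)])
print(x_best, f_best)
```

For this test problem the constrained minimum is at (1.5, 0.5) with f = 0.5, so the printed value should be slightly above 0.5; how close it gets depends on the number of trial vectors.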
Another procedure involves constructing an unconstrained function, $F(X)$, by
adding a penalty for violating any constraint (as described in Section 7.12):

$F(X) = f(X) + a \sum_{j=1}^{m} [G_j(X)]^2 + b \sum_{k=1}^{p} [H_k(X)]^2$    (7.4)

where

$[G_j(X)]^2 = [\max(0, g_j(X))]^2$    (7.5)

$[H_k(X)]^2 = h_k^2(X)$    (7.6)

indicate the squares of violations of inequality and equality constraints, respectively, 
and a and b are constants. Equation (7.4) indicates that while minimizing the objective 
function /(X), a positive penalty is added whenever a constraint is violated, the penalty 
being proportional to the square of the amount of violation. The values of the constants 
a and b can be adjusted to change the contributions of the penalty terms relative to the 
magnitude of the objective function. 
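
As a small illustration of Eqs. (7.4) to (7.6), the sketch below builds the penalized function from an objective and lists of constraint functions. The helper name `penalized`, the sample objective, and the two sample constraints are hypothetical choices made only for this example.

```python
def penalized(f, ineq, eq, a=1000.0, b=1000.0):
    """Return F(X) = f(X) + a*sum(max(0, g_j)^2) + b*sum(h_k^2), Eq. (7.4)."""
    def F(x):
        g_pen = sum(max(0.0, g(x)) ** 2 for g in ineq)   # Eq. (7.5)
        h_pen = sum(h(x) ** 2 for h in eq)               # Eq. (7.6)
        return f(x) + a * g_pen + b * h_pen
    return F

# Assumed example: f = x1^2 + x2^2, g = 1 - x1 <= 0, h = x1 - x2 = 0
f = lambda x: x[0] ** 2 + x[1] ** 2
ineq = [lambda x: 1.0 - x[0]]
eq = [lambda x: x[0] - x[1]]
F = penalized(f, ineq, eq, a=10.0, b=10.0)

F_feasible = F([1.0, 1.0])   # no violation: F = f = 2.0
F_violated = F([0.0, 1.0])   # g = 1 and h = -1: F = 1 + 10*1 + 10*1 = 21.0
print(F_feasible, F_violated)
```

Raising a or b makes violations more expensive relative to f, which is the adjustment described in the paragraph above.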

Note that the random search methods are not efficient compared to the other methods
described in this chapter. However, they are very simple to program and usually
are reliable in finding a nearly optimal solution with a sufficiently large number of
trial vectors. Also, these methods can find a near-global optimal solution even when the
feasible region is nonconvex.


7.4 COMPLEX METHOD 

In 1965, Box extended the simplex method of unconstrained minimization (discussed 
in Section 6.7) to solve constrained minimization problems of the type [7.2]: 


Minimize $f(X)$    (7.7a)

subject to

$g_j(X) \le 0, \quad j = 1, 2, \ldots, m$    (7.7b)

$x_i^{(l)} \le x_i \le x_i^{(u)}, \quad i = 1, 2, \ldots, n$    (7.7c)


In general, the satisfaction of the side constraints (lower and upper bounds on the
variables $x_i$) may not correspond to the satisfaction of the constraints $g_j(X) \le 0$. This
method cannot handle nonlinear equality constraints. The formation of a sequence of
geometric figures each having $k = n + 1$ vertices in an n-dimensional space (called
the simplex) is the basic idea in the simplex method. In the complex method also,
a sequence of geometric figures each having $k \ge n + 1$ vertices is formed to find the
constrained minimum point. The method assumes that an initial feasible point $X_1$ (which
satisfies all the m constraints) is available.


Iterative Procedure 

1. Find $k \ge n + 1$ points, each of which satisfies all m constraints. In actual practice,
we start with only one feasible point $X_1$, and the remaining $k - 1$ points
are found one at a time by the use of random numbers generated in the range
0 to 1, as

$x_{i,j} = x_i^{(l)} + r_{i,j} (x_i^{(u)} - x_i^{(l)}), \quad i = 1, 2, \ldots, n, \quad j = 2, 3, \ldots, k$    (7.8)

where $x_{i,j}$ is the ith component of the point $X_j$, and $r_{i,j}$ is a random number
lying in the interval (0, 1). It is to be noted that the points $X_2, X_3, \ldots, X_k$
generated according to Eq. (7.8) satisfy the side constraints, Eqs. (7.7c), but
may not satisfy the constraints given by Eqs. (7.7b).

As soon as a new point $X_j$ is generated ($j = 2, 3, \ldots, k$), we find whether
it satisfies all the constraints, Eqs. (7.7b). If $X_j$ violates any of the constraints
stated in Eqs. (7.7b), the trial point $X_j$ is moved halfway toward the centroid
of the remaining, already accepted points (where the given initial point $X_1$ is
included). The centroid $X_0$ of already accepted points is given by

$X_0 = \dfrac{1}{j - 1} \sum_{l=1}^{j-1} X_l$    (7.9)

If the trial point $X_j$ so found still violates some of the constraints, Eqs. (7.7b),
the process of moving halfway in toward the centroid $X_0$ is continued until
a feasible point $X_j$ is found. Ultimately, we will be able to find a feasible
point $X_j$ by this procedure provided that the feasible region is convex. By
proceeding in this way, we will ultimately be able to find the required feasible
points $X_2, X_3, \ldots, X_k$.

2. The objective function is evaluated at each of the k points (vertices). If the
vertex $X_h$ corresponds to the largest function value, the process of reflection is
used to find a new point $X_r$ as

$X_r = (1 + \alpha) X_0 - \alpha X_h$    (7.10)

where $\alpha \ge 1$ (to start with) and $X_0$ is the centroid of all vertices except $X_h$:

$X_0 = \dfrac{1}{k - 1} \sum_{l=1,\, l \ne h}^{k} X_l$    (7.11)

3. Since the problem is a constrained one, the point $X_r$ has to be tested for feasibility.
If the point $X_r$ is feasible and $f(X_r) < f(X_h)$, the point $X_h$ is replaced
by $X_r$, and we go to step 2. If $f(X_r) \ge f(X_h)$, a new trial point $X_r$ is found
by reducing the value of $\alpha$ in Eq. (7.10) by a factor of 2 and is tested for
the satisfaction of the relation $f(X_r) < f(X_h)$. If $f(X_r) \ge f(X_h)$, the procedure
of finding a new point $X_r$ with a reduced value of $\alpha$ is repeated again.
This procedure is repeated, if necessary, until the value of $\alpha$ becomes smaller
than a prescribed small quantity $\varepsilon$, say, $10^{-6}$. If an improved point $X_r$, with
$f(X_r) < f(X_h)$, cannot be obtained even with that small value of $\alpha$, the point
$X_r$ is discarded and the entire procedure of reflection is restarted by using the
point $X_p$ (which has the second-highest function value) instead of $X_h$.
4. If at any stage, the reflected point $X_r$ (found in step 3) violates any of the
constraints [Eqs. (7.7b)], it is moved halfway in toward the centroid until it
becomes feasible, that is,

$(X_r)_{\text{new}} = \tfrac{1}{2} (X_0 + X_r)$    (7.12)

This method will progress toward the optimum point as long as the complex
has not collapsed into its centroid.

5. Each time the worst point $X_h$ of the current complex is replaced by a new
point, the complex gets modified and we have to test for the convergence of the
process. We assume convergence of the process whenever the following two
conditions are satisfied:

(a) The complex shrinks to a specified small size (i.e., the distance between
any two vertices among $X_1, X_2, \ldots, X_k$ becomes smaller than a prescribed
small quantity, $\varepsilon_1$).

(b) The standard deviation of the function value becomes sufficiently small
(i.e., when

$\left\{ \dfrac{1}{k} \sum_{j=1}^{k} [f(\bar{X}) - f(X_j)]^2 \right\}^{1/2} \le \varepsilon_2$    (7.13)

where $\bar{X}$ is the centroid of all the k vertices of the current complex, and
$\varepsilon_2 > 0$ is a specified small number).
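
The iterative procedure can be sketched compactly in Python. This is a simplified illustration under stated assumptions, not a reference implementation: the function name `complex_method`, its parameters, and the test problem are introduced here; only the function-value test of Eq. (7.13) is used for convergence (the size test (a) is omitted); and the feasible region is assumed convex with a strictly interior starting point.

```python
import random

def complex_method(f, ineq, lower, upper, x1, k=None, alpha=1.3,
                   eps=1e-6, tol=1e-8, max_iter=2000, seed=0):
    """Simplified sketch of the complex method for g_j(X) <= 0 plus bounds."""
    rng = random.Random(seed)
    n = len(x1)
    k = k or 2 * n
    feasible = lambda x: (all(g(x) <= 0.0 for g in ineq) and
                          all(l <= xi <= u for xi, l, u in zip(x, lower, upper)))
    centroid = lambda pts: [sum(c) / len(pts) for c in zip(*pts)]
    halfway = lambda x, c: [0.5 * (xi + ci) for xi, ci in zip(x, c)]

    # Step 1: build k feasible vertices, Eqs. (7.8) and (7.9)
    pts = [list(x1)]
    while len(pts) < k:
        x = [l + rng.random() * (u - l) for l, u in zip(lower, upper)]
        c = centroid(pts)
        while not feasible(x):
            x = halfway(x, c)
        pts.append(x)

    for _ in range(max_iter):
        vals = [f(p) for p in pts]
        # Step 5(b): stop when the function values have a small spread, Eq. (7.13)
        fc = f(centroid(pts))
        if (sum((fc - v) ** 2 for v in vals) / k) ** 0.5 <= tol:
            break
        # Steps 2-4: reflect the worst vertex; if that fails, try the second worst
        order = sorted(range(k), key=lambda i: vals[i], reverse=True)
        for h in order[:2]:
            c = centroid([p for i, p in enumerate(pts) if i != h])
            a, replaced = alpha, False
            while a > eps:
                xr = [(1 + a) * ci - a * xi for ci, xi in zip(c, pts[h])]  # Eq. (7.10)
                tries = 0
                while not feasible(xr) and tries < 60:                     # Eq. (7.12)
                    xr = halfway(xr, c)
                    tries += 1
                if feasible(xr) and f(xr) < vals[h]:
                    pts[h], replaced = xr, True
                    break
                a *= 0.5                       # step 3: halve alpha and retry
            if replaced:
                break
        else:
            break   # neither of the two worst vertices could be improved
    return min(pts, key=f)

# Assumed test problem: minimize (x1 - 2)^2 + (x2 - 1)^2 with x1 + x2 - 2 <= 0
f = lambda x: (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2
g = [lambda x: x[0] + x[1] - 2.0]
x_start = [0.0, 0.0]
x_best = complex_method(f, g, [-3.0, -3.0], [3.0, 3.0], x_start)
print(x_best, f(x_best))
```

Every vertex is kept feasible throughout, and a vertex is replaced only by a point with a lower function value, so the returned point is never worse than the starting point. How closely it approaches the constrained minimum depends on k, the tolerances, and the random numbers drawn.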


Discussion. This method does not require the derivatives of $f(X)$ and $g_j(X)$ to find
the minimum point, and hence it is computationally very simple. The method is very
simple from a programming point of view and does not require large computer storage.

1. A value of 1.3 for the initial value of $\alpha$ in Eq. (7.10) has been found to be
satisfactory by Box.

2. Box recommended a value of $k \approx 2n$ (although a lesser value can be chosen
if n is greater than, say, 5). If k is not sufficiently large, the complex tends to
collapse and flatten along the first constraint boundary encountered.

3. From the procedure above, it can be observed that the complex rolls over and
over, normally expanding. However, if a boundary is encountered, the complex
contracts and flattens itself. It can then roll along this constraint boundary and
leave it if the contours change. The complex can also accommodate more than
one boundary and can turn corners.

4. If the feasible region is nonconvex, there is no guarantee that the centroid of all
feasible points is also feasible. If the centroid is not feasible, we cannot apply
the procedure above to find the new points $X_r$.

5. The method becomes inefficient rapidly as the number of variables increases.

6. It cannot be used to solve problems having equality constraints.

7. This method requires an initial point $X_1$ that is feasible. This is not a major
restriction. If an initial feasible point is not readily available, the method
described in Section 7.13 can be used to find a feasible point $X_1$.
7.5 SEQUENTIAL LINEAR PROGRAMMING

In the sequential linear programming (SLP) method, the solution of the original nonlinear
programming problem is found by solving a series of linear programming problems.
Each LP problem is generated by approximating the nonlinear objective and constraint
functions using first-order Taylor series expansions about the current design vector, $X_i$.
The resulting LP problem is solved using the simplex method to find the new design
vector $X_{i+1}$. If $X_{i+1}$ does not satisfy the stated convergence criteria, the problem is
relinearized about the point $X_{i+1}$ and the procedure is continued until the optimum
solution $X^*$ is found.

If the problem is a convex programming problem, the linearized constraints always
lie entirely outside the feasible region. Hence the optimum solution of the approximating
LP problem, which lies at a vertex of the new feasible region, will lie outside the
original feasible region. However, by relinearizing the problem about the new point
and repeating the process, we can achieve convergence to the solution of the original
problem in a few iterations. The SLP method, also known as the cutting plane method,
was originally presented by Cheney and Goldstein [7.3] and Kelly [7.4].

Algorithm. The SLP algorithm can be stated as follows: 

1. Start with an initial point $X_1$ and set the iteration number as $i = 1$. The point
$X_1$ need not be feasible.

2. Linearize the objective and constraint functions about the point $X_i$ as

$f(X) \approx f(X_i) + \nabla f(X_i)^T (X - X_i)$    (7.14)

$g_j(X) \approx g_j(X_i) + \nabla g_j(X_i)^T (X - X_i)$    (7.15)

$h_k(X) \approx h_k(X_i) + \nabla h_k(X_i)^T (X - X_i)$    (7.16)

3. Formulate the approximating linear programming problem as†

Minimize $f(X_i) + \nabla f_i^T (X - X_i)$

subject to

$g_j(X_i) + \nabla g_j(X_i)^T (X - X_i) \le 0, \quad j = 1, 2, \ldots, m$

$h_k(X_i) + \nabla h_k(X_i)^T (X - X_i) = 0, \quad k = 1, 2, \ldots, p$    (7.17)

4. Solve the approximating LP problem to obtain the solution vector $X_{i+1}$.

5. Evaluate the original constraints at $X_{i+1}$; that is, find

$g_j(X_{i+1}), \quad j = 1, 2, \ldots, m$ and $h_k(X_{i+1}), \quad k = 1, 2, \ldots, p$

If $g_j(X_{i+1}) \le \varepsilon$ for $j = 1, 2, \ldots, m$, and $|h_k(X_{i+1})| \le \varepsilon$, $k = 1, 2, \ldots, p$,
where $\varepsilon$ is a prescribed small positive tolerance, all the original constraints can
be assumed to have been satisfied. Hence stop the procedure by taking

$X_{\text{opt}} \simeq X_{i+1}$

If $g_j(X_{i+1}) > \varepsilon$ for some j, or $|h_k(X_{i+1})| > \varepsilon$ for some k, find the most violated
constraint, for example, as

$g_k(X_{i+1}) = \max_j [g_j(X_{i+1})]$    (7.19)

Relinearize the constraint $g_k(X) \le 0$ about the point $X_{i+1}$ as

$g_k(X) \approx g_k(X_{i+1}) + \nabla g_k(X_{i+1})^T (X - X_{i+1}) \le 0$    (7.20)

and add this as the (m + 1)th inequality constraint to the previous LP problem.

6. Set the new iteration number as $i = i + 1$, the total number of constraints in
the new approximating LP problem as m + 1 inequalities and p equalities, and
go to step 4.

† Notice that the LP problem stated in Eq. (7.17) may sometimes have an unbounded solution. This can be
avoided by formulating the first approximating LP problem by considering only the following constraints:

$l_i \le x_i \le u_i, \quad i = 1, 2, \ldots, n$    (7.18)

In Eq. (7.18), $l_i$ and $u_i$ represent the lower and upper bounds on $x_i$, respectively. The values of $l_i$ and
$u_i$ depend on the problem under consideration, and their values have to be chosen such that the optimum
solution of the original problem does not fall outside the range indicated by Eq. (7.18).

The sequential linear programming method has several advantages:

1. It is an efficient technique for solving convex programming problems with
nearly linear objective and constraint functions.

2. Each of the approximating problems will be an LP problem and hence can be
solved quite efficiently. Moreover, any two consecutive approximating LP problems
differ by only one constraint, and hence the dual simplex method can be
used to solve the sequence of approximating LP problems much more efficiently.†

3. The method can easily be extended to solve integer programming problems. In
this case, one integer LP problem has to be solved in each stage.

† The dual simplex method was discussed in Section 4.3.


Geometric Interpretation of the Method. The SLP method can be illustrated with
the help of a one-variable problem:

Minimize $f(x) = c_1 x$

subject to

$g(x) \le 0$    (7.21)

where $c_1$ is a constant and $g(x)$ is a nonlinear function of x. Let the feasible region and
the contour of the objective function be as shown in Fig. 7.5. To avoid any possibility
of unbounded solution, let us first take the constraints on x as $c \le x \le d$, where c and
d represent the lower and upper bounds on x. With these constraints, we formulate the
LP problem:

Minimize $f(x) = c_1 x$
subject to

$c \le x \le d$    (7.22)

The optimum solution of this approximating LP problem can be seen to be $x^* = c$.
Next, we linearize the constraint $g(x)$ about point c and add it to the previous constraint
set. Thus the new LP problem becomes

Minimize $f(x) = c_1 x$    (7.23a)

subject to

$c \le x \le d$    (7.23b)

$g(c) + \dfrac{dg}{dx}(c)(x - c) \le 0$    (7.23c)

The feasible region of x, according to the constraints (7.23b) and (7.23c), is given by
$e \le x \le d$ (Fig. 7.6). The optimum solution of the approximating LP problem given
by Eqs. (7.23) can be seen to be $x^* = e$. Next, we linearize the constraint $g(x) \le 0$
about the current solution $x^* = e$ and add it to the previous constraint set to obtain the
next approximating LP problem as

Minimize $f(x) = c_1 x$    (7.24a)

subject to

$c \le x \le d$    (7.24b)

$g(c) + \dfrac{dg}{dx}(c)(x - c) \le 0$    (7.24c)

$g(e) + \dfrac{dg}{dx}(e)(x - e) \le 0$    (7.24d)

The permissible range of x, according to the constraints (7.24b), (7.24c), and (7.24d),
can be seen to be $f \le x \le d$ from Fig. 7.7. The optimum solution of the LP problem
of Eqs. (7.24) can be obtained as $x^* = f$.

We then linearize $g(x) \le 0$ about the present point $x^* = f$ and add it to the
previous constraint set [Eqs. (7.24)] to define a new approximating LP problem. This
procedure has to be continued until the optimum solution is found to the desired level of
accuracy. As can be seen from Figs. 7.6 and 7.7, the optima of all the approximating
LP problems (e.g., points c, e, f, . . .) lie outside the feasible region and converge toward
the true optimum point, $x = a$. The process is assumed to have converged whenever
the solution of an approximating problem satisfies the original constraint within some
specified tolerance level as

$g(x_k^*) \le \varepsilon$

where $\varepsilon$ is a small positive number and $x_k^*$ is the optimum solution of the kth approximating
LP problem. It can be seen that the lines (hyperplanes in a general problem)
defined by $g(x_k^*) + \dfrac{dg}{dx}(x_k^*)(x - x_k^*) = 0$ cut off a portion of the existing feasible region.
Hence this method is called the cutting plane method.
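
The one-variable interpretation can be traced numerically. In one variable each linearized constraint is simply a new lower or upper bound on x, so every approximating LP is solved by picking an end point of the current interval. The sketch below assumes a specific constraint, $g(x) = (x - 3)^2 - 1$, with $c_1 = 1$, $c = 0$, and $d = 5$; these values and the function name `cutting_plane_1d` are illustrative choices, not taken from the text.

```python
def cutting_plane_1d(c1, g, dg, c, d, eps=1e-6, max_iter=100):
    """Minimize c1*x subject to g(x) <= 0 and c <= x <= d by successive cuts."""
    lo, hi = c, d
    history = []
    for _ in range(max_iter):
        x = lo if c1 > 0 else hi          # optimum of the current LP
        history.append(x)
        if g(x) <= eps:                   # original constraint satisfied
            return x, history
        slope = dg(x)
        if slope == 0.0:
            raise ValueError("zero slope: the linearized constraint gives no cut")
        cut = x - g(x) / slope            # root of g(x_k) + g'(x_k)(x - x_k) = 0
        if slope < 0:
            lo = max(lo, cut)             # linearized constraint reads x >= cut
        else:
            hi = min(hi, cut)             # linearized constraint reads x <= cut
    raise RuntimeError("no convergence within max_iter")

g = lambda x: (x - 3.0) ** 2 - 1.0        # feasible region is 2 <= x <= 4
dg = lambda x: 2.0 * (x - 3.0)
x_opt, history = cutting_plane_1d(1.0, g, dg, 0.0, 5.0)
print(history)
```

The first two iterates are x = 0 and x = 4/3. All iterates stay below the true optimum x = 2, that is, outside the feasible region, which mirrors the behavior of the points c, e, f described above.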
Example 7.1

Minimize $f(x_1, x_2) = x_1 - x_2$

subject to

$g_1(x_1, x_2) = 3x_1^2 - 2x_1 x_2 + x_2^2 - 1 \le 0$

using the cutting plane method. Take the convergence limit in step 5 as $\varepsilon = 0.02$.

Note: This example was originally given by Kelly [7.4]. Since the constraint
boundary represents an ellipse, the problem is a convex programming problem. From
graphical representation, the optimum solution of the problem can be identified as
$x_1^* = 0$, $x_2^* = 1$, and $f_{\min} = -1$.


SOLUTION

Steps 1, 2, 3: Although we can start the solution from any initial point $X_1$, to avoid
the possible unbounded solution, we first take the bounds on $x_1$ and $x_2$
as $-2 \le x_1 \le 2$ and $-2 \le x_2 \le 2$ and solve the following LP problem:

Minimize $f = x_1 - x_2$

subject to

$-2 \le x_1 \le 2$

$-2 \le x_2 \le 2$    (E1)

The solution of this problem can be obtained as

$X = \{-2, 2\}^T$ with $f(X) = -4$

Step 4: Since we have solved one LP problem, we can take

$X_{i+1} = X_2 = \{-2, 2\}^T$

Step 5: Since $g_1(X_2) = 23 > \varepsilon$, we linearize $g_1(X)$ about point $X_2$ as

$g_1(X) \approx g_1(X_2) + \nabla g_1(X_2)^T (X - X_2) \le 0$    (E2)

As

$g_1(X_2) = 23$,

$\dfrac{\partial g_1}{\partial x_1}\Big|_{X_2} = (6x_1 - 2x_2)\big|_{X_2} = -16$

$\dfrac{\partial g_1}{\partial x_2}\Big|_{X_2} = (-2x_1 + 2x_2)\big|_{X_2} = 8$

Eq. (E2) becomes

$g_1(X) \approx -16x_1 + 8x_2 - 25 \le 0$

By adding this constraint to the previous LP problem, the new LP problem
becomes

Minimize $f = x_1 - x_2$

subject to

$-2 \le x_1 \le 2$

$-2 \le x_2 \le 2$

$-16x_1 + 8x_2 - 25 \le 0$    (E3)

Step 6: Set the iteration number as $i = 2$ and go to step 4.

Step 4: Solve the approximating LP problem stated in Eqs. (E3) and obtain the
solution

$X_3 = \{-0.5625, 2.0\}^T$ with $f_3 = f(X_3) = -2.5625$

This procedure is continued until the specified convergence criterion,
$g_1(X_i) \le \varepsilon$, in step 5 is satisfied. The computational results are summarized
in Table 7.2.
Table 7.2 Results for Example 7.1

                                                        Solution of the approximating LP problem
Iteration   New linearized constraint
number, i   considered                                  X_{i+1}                f(X_{i+1})   g_1(X_{i+1})
------------------------------------------------------------------------------------------------------
 1          -2 ≤ x_1 ≤ 2 and -2 ≤ x_2 ≤ 2               (-2.0, 2.0)            -4.00000     23.00000
 2          -16.0x_1 + 8.0x_2 - 25.0 ≤ 0                (-0.56250, 2.00000)    -2.56250      6.19922
 3          -7.375x_1 + 5.125x_2 - 8.19922 ≤ 0          (0.27807, 2.00000)     -1.72193      2.11978
 4          -2.33157x_1 + 3.44386x_2 - 4.11958 ≤ 0      (-0.52970, 0.83759)    -1.36730      1.43067
 5          -4.85341x_1 + 2.73459x_2 - 3.43067 ≤ 0      (-0.05314, 1.16024)    -1.21338      0.47793
 6          -2.63930x_1 + 2.42675x_2 - 2.47792 ≤ 0      (0.42655, 1.48490)     -1.05845      0.48419
 7          -0.41071x_1 + 2.11690x_2 - 2.48420 ≤ 0      (0.17058, 1.20660)     -1.03603      0.13154
 8          -1.38975x_1 + 2.07205x_2 - 2.13155 ≤ 0      (0.01829, 1.04098)     -1.02269      0.04656
 9          -1.97223x_1 + 2.04538x_2 - 2.04657 ≤ 0      (-0.16626, 0.84027)    -1.00653      0.06838
10          -2.67809x_1 + 2.01305x_2 - 2.06838 ≤ 0      (-0.07348, 0.92972)    -1.00321      0.01723

7.6 BASIC APPROACH IN THE METHODS OF FEASIBLE 
DIRECTIONS 

In the methods of feasible directions, basically we choose a starting point satisfying all
the constraints and move to a better point according to the iterative scheme

$X_{i+1} = X_i + \lambda S_i$    (7.25)

where $X_i$ is the starting point for the ith iteration, $S_i$ the direction of movement, $\lambda$
the distance of movement (step length), and $X_{i+1}$ the final point obtained at the end
of the ith iteration. The value of $\lambda$ is always chosen so that $X_{i+1}$ lies in the feasible
region. The search direction $S_i$ is found such that (1) a small move in that direction
violates no constraint, and (2) the value of the objective function can be reduced in
that direction. The new point $X_{i+1}$ is taken as the starting point for the next iteration
and the entire procedure is repeated several times until a point is obtained such that
no direction satisfying both properties 1 and 2 can be found. In general, such a point
denotes the constrained local minimum of the problem. This local minimum need not
be a global one unless the problem is a convex programming problem. A direction
satisfying property 1 is called feasible while a direction satisfying both properties 1
and 2 is called a usable feasible direction. This is the reason that these methods are
known as methods of feasible directions. There are many ways of choosing usable
feasible directions, and hence there are many different methods of feasible directions.
As seen in Chapter 2, a direction S is feasible at a point $X_i$ if it satisfies the relation

$\dfrac{d}{d\lambda} g_j(X_i + \lambda S)\Big|_{\lambda=0} = S^T \nabla g_j(X_i) \le 0$    (7.26)

where the equality sign holds true only if a constraint is linear or strictly concave,
as shown in Fig. 2.8. A vector S will be a usable feasible direction if it satisfies the
relations

$\dfrac{d}{d\lambda} f(X_i + \lambda S)\Big|_{\lambda=0} = S^T \nabla f(X_i) < 0$    (7.27)

$\dfrac{d}{d\lambda} g_j(X_i + \lambda S)\Big|_{\lambda=0} = S^T \nabla g_j(X_i) \le 0$    (7.28)

It is possible to reduce the value of the objective function at least by a small amount
by taking a step length $\lambda > 0$ along such a direction.

The detailed iterative procedure of the methods of feasible directions will be considered
in terms of two well-known methods: Zoutendijk’s method of feasible directions
and Rosen’s gradient projection method.

7.7 ZOUTENDIJK’S METHOD OF FEASIBLE DIRECTIONS

In Zoutendijk’s method of feasible directions, the usable feasible direction is taken as
the negative of the gradient direction if the initial point of the iteration lies in the
interior (not on the boundary) of the feasible region. However, if the initial point
lies on the boundary of the feasible region, some constraints will be active and the
usable feasible direction is found so as to satisfy Eqs. (7.27) and (7.28). The iterative
procedure of Zoutendijk’s method can be stated as follows (only inequality constraints
are considered in Eq. (7.1), for simplicity).

Algorithm 

1. Start with an initial feasible point $X_1$ and small numbers $\varepsilon_1$, $\varepsilon_2$, and $\varepsilon_3$ to test
the convergence of the method. Evaluate $f(X_1)$ and $g_j(X_1)$, $j = 1, 2, \ldots, m$.
Set the iteration number as $i = 1$.

2. If $g_j(X_i) < 0$, $j = 1, 2, \ldots, m$ (i.e., $X_i$ is an interior feasible point), set the
current search direction as

$S_i = -\nabla f(X_i)$    (7.29)

Normalize $S_i$ in a suitable manner and go to step 5. If at least one $g_j(X_i) = 0$,
go to step 3.

3. Find a usable feasible direction S by solving the direction-finding problem:

Minimize $-\alpha$    (7.30a)

subject to

$S^T \nabla g_j(X_i) + \theta_j \alpha \le 0, \quad j = 1, 2, \ldots, p$    (7.30b)

$S^T \nabla f + \alpha \le 0$    (7.30c)

$-1 \le s_i \le 1, \quad i = 1, 2, \ldots, n$    (7.30d)

where $s_i$ is the ith component of S, the first p constraints have been assumed
to be active at the point $X_i$ (the constraints can always be renumbered to satisfy
this requirement), and the values of all $\theta_j$ can be taken as unity. Here $\alpha$ can be
taken as an additional design variable.

4. If the value of $\alpha^*$ found in step 3 is very nearly equal to zero, that is, if $\alpha^* \le \varepsilon_1$,
terminate the computation by taking $X_{\text{opt}} \simeq X_i$. If $\alpha^* > \varepsilon_1$, go to step 5 by taking
$S_i = S$.

5. Find a suitable step length $\lambda_i$ along the direction $S_i$ and obtain a new point
$X_{i+1}$ as

$X_{i+1} = X_i + \lambda_i S_i$    (7.31)

The methods of finding the step length $\lambda_i$ will be considered later.

6. Evaluate the objective function $f(X_{i+1})$.

7. Test for the convergence of the method. If

$\left| \dfrac{f(X_i) - f(X_{i+1})}{f(X_i)} \right| \le \varepsilon_2$ and $\| X_i - X_{i+1} \| \le \varepsilon_3$    (7.32)

terminate the iteration by taking $X_{\text{opt}} \simeq X_{i+1}$. Otherwise, go to step 8.

8. Set the new iteration number as $i = i + 1$, and repeat from step 2 onward.


There are several points to be considered in applying this algorithm. These are 
related to (1) finding an appropriate usable feasible direction (S), (2) finding a suitable 
step size along the direction S, and (3) speeding up the convergence of the process. 
All these aspects are discussed below. 


7.7.1 Direction-Finding Problem 

If the point $\mathbf{X}_i$ lies in the interior of the feasible region [i.e., $g_j(\mathbf{X}_i) < 0$ for $j = 1, 2, \ldots, m$], the usable feasible direction is taken as

$$\mathbf{S}_i = -\nabla f(\mathbf{X}_i) \tag{7.33}$$

The problem becomes complicated if one or more of the constraints are critically satisfied at $\mathbf{X}_i$, that is, when some of the $g_j(\mathbf{X}_i) = 0$. One simple way to find a usable feasible direction at a point $\mathbf{X}_i$ at which some of the constraints are active is to generate a random vector and verify whether it satisfies Eqs. (7.27) and (7.28). This approach is a crude one but is very simple and easy to program. The relations to be checked for each random vector are also simple, and hence it will not require much computer time. However, a more systematic procedure is generally adopted to find a usable feasible direction in practice. Since there will be, in general, several directions that satisfy Eqs. (7.27) and (7.28), one would naturally be tempted to choose the "best" possible usable feasible direction at $\mathbf{X}_i$.
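The random-vector approach mentioned above can be sketched in a few lines of Python. This is only an illustration: it assumes that Eqs. (7.27) and (7.28) are the usual usability and feasibility conditions $\mathbf{S}^T \nabla f < 0$ and $\mathbf{S}^T \nabla g_j \le 0$ for the active constraints, and the function names and the sample gradients are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def is_usable_feasible(s, grad_f, active_grads):
    """Check S^T grad_f < 0 (usable) and S^T grad_g_j <= 0 for each active j (feasible)."""
    if dot(s, grad_f) >= 0.0:
        return False
    return all(dot(s, g) <= 0.0 for g in active_grads)


def random_usable_feasible_direction(grad_f, active_grads, max_trials=1000, seed=0):
    """Draw random vectors with components in [-1, 1] until one passes the check.

    Returns None if no acceptable vector is found within max_trials.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n = len(grad_f)
    for _ in range(max_trials):
        s = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if is_usable_feasible(s, grad_f, active_grads):
            return s
    return None


# Illustrative data: one active linear constraint with gradient (1, 2)
grad_f = [-4.0 / 3.0, -4.0 / 3.0]
active_grads = [[1.0, 2.0]]
s = random_usable_feasible_direction(grad_f, active_grads)
```

A direction found this way is merely acceptable, not the best one, which is the motivation for the systematic direction-finding problem that follows.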

Thus we seek to find a feasible direction that, in addition to decreasing the value of $f$, also points away from the boundaries of the active nonlinear constraints. Such a direction can be found by solving the following optimization problem. Given the point $\mathbf{X}_i$, find the vector $\mathbf{S}$ and the scalar $\alpha$ that maximize $\alpha$ subject to the constraints

$$\mathbf{S}^T \nabla g_j(\mathbf{X}_i) + \theta_j \alpha \le 0, \quad j \in J \tag{7.34}$$

$$\mathbf{S}^T \nabla f(\mathbf{X}_i) + \alpha \le 0 \tag{7.35}$$

where $J$ represents the set of active constraints and $\mathbf{S}$ is normalized by one of the following relations:

$$\mathbf{S}^T \mathbf{S} = \sum_{i=1}^{n} s_i^2 = 1 \tag{7.36}$$

$$-1 \le s_i \le 1, \quad i = 1, 2, \ldots, n \tag{7.37}$$

$$\mathbf{S}^T \nabla f(\mathbf{X}_i) \le 1 \tag{7.38}$$

In this problem, $\theta_j$ are arbitrary positive scalar constants, and for simplicity, we can take all $\theta_j = 1$. Any solution of this problem with $\alpha > 0$ is a usable feasible direction. The maximum value of $\alpha$ gives the best direction ($\mathbf{S}$) that makes the value of $\mathbf{S}^T \nabla f$ negative and the values of $\mathbf{S}^T \nabla g_j(\mathbf{X}_i)$ as negative as possible simultaneously. In other words, the maximum value of $\alpha$ makes the direction $\mathbf{S}$ steer away from the active nonlinear constraint boundaries. It can easily be seen that by giving different values for different $\theta_j$, we can give more importance to certain constraint boundaries compared to others. Equations (7.36) to (7.38) represent the normalization of the vector $\mathbf{S}$ so as to ensure that the maximum of $\alpha$ will be a finite quantity. If the normalization condition is not included, the maximum of $\alpha$ may be made to approach $\infty$ without violating the constraints [Eqs. (7.34) and (7.35)].

Notice that the objective function $\alpha$ and the constraint equations (7.34) and (7.35) are linear in terms of the variables $s_1, s_2, \ldots, s_n, \alpha$. The normalization constraint will also be linear if we use either Eq. (7.37) or (7.38). However, if we use Eq. (7.36) for normalization, it will be a quadratic function. Thus the direction-finding problem can be posed as a linear programming problem by using either Eq. (7.37) or (7.38) for normalization. Even otherwise, the problem will be an LP problem except for one quadratic constraint. It was shown by Zoutendijk [7.5] that this problem can be handled by a modified version of linear programming. Thus the direction-finding problem can be solved with reasonable efficiency. We use Eq. (7.37) in our presentation. The direction-finding problem can be stated more explicitly as


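$$\text{Minimize } -\alpha$$

subject to

$$
\begin{aligned}
s_1 \frac{\partial g_1}{\partial x_1} + s_2 \frac{\partial g_1}{\partial x_2} + \cdots + s_n \frac{\partial g_1}{\partial x_n} + \theta_1 \alpha &\le 0 \\
s_1 \frac{\partial g_2}{\partial x_1} + s_2 \frac{\partial g_2}{\partial x_2} + \cdots + s_n \frac{\partial g_2}{\partial x_n} + \theta_2 \alpha &\le 0 \\
&\;\;\vdots \\
s_1 \frac{\partial g_p}{\partial x_1} + s_2 \frac{\partial g_p}{\partial x_2} + \cdots + s_n \frac{\partial g_p}{\partial x_n} + \theta_p \alpha &\le 0 \\
s_1 \frac{\partial f}{\partial x_1} + s_2 \frac{\partial f}{\partial x_2} + \cdots + s_n \frac{\partial f}{\partial x_n} + \alpha &\le 0 \\
s_1 - 1 &\le 0 \\
s_2 - 1 &\le 0 \\
&\;\;\vdots \\
s_n - 1 &\le 0 \\
-1 - s_1 &\le 0 \\
-1 - s_2 &\le 0 \\
&\;\;\vdots \\
-1 - s_n &\le 0
\end{aligned}
\tag{7.39}
$$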

where $p$ is the number of active constraints and the partial derivatives $\partial g_1/\partial x_1, \partial g_1/\partial x_2, \ldots, \partial g_p/\partial x_n, \partial f/\partial x_1, \ldots, \partial f/\partial x_n$ have been evaluated at point $\mathbf{X}_i$. Since the components of the search direction, $s_i$, $i = 1$ to $n$, can take any value between $-1$ and $1$, we define new variables $t_i$ as $t_i = s_i + 1$, $i = 1$ to $n$, so that the variables will always be nonnegative. With this change of variables, the problem above can be restated as a standard linear programming problem as follows:


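Find $(t_1, t_2, \ldots, t_n, \alpha, y_1, y_2, \ldots, y_{p+n+1})$ which

$$\text{minimizes } -\alpha$$

subject to

$$
\begin{aligned}
t_1 \frac{\partial g_1}{\partial x_1} + t_2 \frac{\partial g_1}{\partial x_2} + \cdots + t_n \frac{\partial g_1}{\partial x_n} + \theta_1 \alpha + y_1 &= \sum_{i=1}^{n} \frac{\partial g_1}{\partial x_i} \\
t_1 \frac{\partial g_2}{\partial x_1} + t_2 \frac{\partial g_2}{\partial x_2} + \cdots + t_n \frac{\partial g_2}{\partial x_n} + \theta_2 \alpha + y_2 &= \sum_{i=1}^{n} \frac{\partial g_2}{\partial x_i} \\
&\;\;\vdots \\
t_1 \frac{\partial g_p}{\partial x_1} + t_2 \frac{\partial g_p}{\partial x_2} + \cdots + t_n \frac{\partial g_p}{\partial x_n} + \theta_p \alpha + y_p &= \sum_{i=1}^{n} \frac{\partial g_p}{\partial x_i} \\
t_1 \frac{\partial f}{\partial x_1} + t_2 \frac{\partial f}{\partial x_2} + \cdots + t_n \frac{\partial f}{\partial x_n} + \alpha + y_{p+1} &= \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \\
t_1 + y_{p+2} &= 2 \\
t_2 + y_{p+3} &= 2 \\
&\;\;\vdots \\
t_n + y_{p+n+1} &= 2 \\
t_1 &\ge 0 \\
t_2 &\ge 0 \\
&\;\;\vdots \\
t_n &\ge 0 \\
\alpha &\ge 0
\end{aligned}
\tag{7.40}
$$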

where $y_1, y_2, \ldots, y_{p+n+1}$ are the nonnegative slack variables. The simplex method discussed in Chapter 3 can be used to solve the direction-finding problem stated in Eqs. (7.40). This problem can also be solved by more sophisticated methods that treat the upper bounds on $t_i$ in a special manner instead of treating them as constraints [7.6]. If the solution of the direction-finding problem gives a value of $\alpha^* > 0$, $f(\mathbf{X})$ can be improved by moving along the usable feasible direction

$$
\mathbf{S} = \begin{Bmatrix} s_1^* \\ s_2^* \\ \vdots \\ s_n^* \end{Bmatrix}
= \begin{Bmatrix} t_1^* - 1 \\ t_2^* - 1 \\ \vdots \\ t_n^* - 1 \end{Bmatrix}
$$

If, however, $\alpha^* = 0$, it can be shown that the Kuhn-Tucker optimality conditions are satisfied at $\mathbf{X}_i$ and hence point $\mathbf{X}_i$ can be taken as the optimal solution.

7.7.2 Determination of Step Length 

After finding a usable feasible direction $\mathbf{S}_i$ at any point $\mathbf{X}_i$, we have to determine a suitable step length $\lambda_i$ to obtain the next point $\mathbf{X}_{i+1}$ as

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i \mathbf{S}_i \tag{7.41}$$

There are several ways of computing the step length. One of the methods is to determine an optimal step length ($\lambda_i$) that minimizes $f(\mathbf{X}_i + \lambda \mathbf{S}_i)$ such that the new point $\mathbf{X}_{i+1}$ given by Eq. (7.41) lies in the feasible region. Another method is to choose the step length ($\lambda_i$) by trial and error so that it satisfies the relations

$$
\begin{aligned}
f(\mathbf{X}_i + \lambda_i \mathbf{S}_i) &\le f(\mathbf{X}_i) \\
g_j(\mathbf{X}_i + \lambda_i \mathbf{S}_i) &\le 0, \quad j = 1, 2, \ldots, m
\end{aligned}
\tag{7.42}
$$

Method 1. The optimal step length, $\lambda_i$, can be found by any of the one-dimensional minimization methods described in Chapter 5. The only drawback with these methods is that the constraints will not be considered while finding $\lambda_i$. Thus the new point $\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i \mathbf{S}_i$ may lie either in the interior of the feasible region (Fig. 7.8a), or on the boundary of the feasible region (Fig. 7.8b), or in the infeasible region (Fig. 7.8c).

If the point $\mathbf{X}_{i+1}$ lies in the interior of the feasible region, there are no active constraints and hence we proceed to the next iteration by setting the new usable feasible direction as $\mathbf{S}_{i+1} = -\nabla f(\mathbf{X}_{i+1})$ (i.e., we go to step 2 of the algorithm). On the other hand, if $\mathbf{X}_{i+1}$ lies on the boundary of the feasible region, we generate a new usable feasible direction $\mathbf{S} = \mathbf{S}_{i+1}$ by solving a new direction-finding problem (i.e., we go to step 3 of the algorithm). One practical difficulty has to be noted at this stage. To detect that point $\mathbf{X}_{i+1}$ is lying on the constraint boundary, we have to find whether one or more $g_j(\mathbf{X}_{i+1})$ are zero. Since the computations are done numerically, will we say that the constraint $g_j$ is active if $g_j(\mathbf{X}_{i+1}) = 10^{-2}$, $10^{-3}$, $10^{-8}$, and so on? We immediately notice that a small margin $\varepsilon$ has to be specified to detect an active constraint. Thus we can accept a point $\mathbf{X}$ to be lying on the constraint boundary if $|g_j(\mathbf{X})| \le \varepsilon$, where $\varepsilon$ is a prescribed small number. If point $\mathbf{X}_{i+1}$ lies in the infeasible region, the step length has to be reduced (corrected) so that the resulting point lies in the feasible region only. It is to be noted that an initial trial step size ($\varepsilon$) has to be specified to initiate the one-dimensional minimization process.

Figure 7.8 Effect of taking optimal step length.

Method 2. Even if we do not want to find the optimal step length, some sort of a trial-and-error method has to be adopted to find the step length $\lambda_i$ so as to satisfy the relations (7.42). One possible method is to choose an arbitrary step length $\varepsilon$ and compute the values of

$$\tilde{f} = f(\mathbf{X}_i + \varepsilon \mathbf{S}_i) \quad \text{and} \quad \tilde{g}_j = g_j(\mathbf{X}_i + \varepsilon \mathbf{S}_i)$$

Depending on the values of $\tilde{f}$ and $\tilde{g}_j$, we may need to adjust the value of $\varepsilon$ until we improve the objective function value without violating the constraints.
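One simple way to carry out the adjustment described in Method 2 is to shrink the trial step until the relations (7.42) hold. The sketch below assumes a halving rule and a strict decrease in $f$; both choices, and the name `trial_step`, are illustrative rather than prescribed by the method.

```python
def trial_step(f, constraints, x, s, eps, shrink=0.5, max_reductions=50):
    """Reduce the trial step eps until f decreases and every g_j <= 0.

    f           : objective, called as f(point)
    constraints : list of functions g_j, feasible when g_j(point) <= 0
    x, s        : current point and search direction (lists of floats)
    Returns (step, new_point).
    """
    f0 = f(x)
    lam = eps
    for _ in range(max_reductions):
        x_new = [xi + lam * si for xi, si in zip(x, s)]
        improved = f(x_new) < f0  # strict decrease, slightly stronger than (7.42)
        feasible = all(g(x_new) <= 0.0 for g in constraints)
        if improved and feasible:
            return lam, x_new
        lam *= shrink  # step too long: reduce and try again
    raise RuntimeError("no acceptable step length found")


# Illustrative problem: f = x1^2 + x2^2 - 4 x1 - 4 x2 + 8, g1 = x1 + 2 x2 - 4 <= 0
f = lambda x: x[0] ** 2 + x[1] ** 2 - 4 * x[0] - 4 * x[1] + 8
g1 = lambda x: x[0] + 2 * x[1] - 4
lam, x_new = trial_step(f, [g1], [0.0, 0.0], [1.0, 1.0], eps=2.0)
```

With these data the first trial ($\varepsilon = 2$) lowers $f$ but violates $g_1$, so the step is halved once and the point (1, 1) is accepted.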

Initial Trial Step Length. It can be seen that in whatever way we want to find the step size $\lambda_i$, we need to specify an initial trial step length $\varepsilon$. The value of $\varepsilon$ can be chosen in several ways. Some of the possibilities are given below.

1. The average of the final step lengths $\lambda_i$ obtained in the last few iterations can be used as the initial trial step length $\varepsilon$ for the next step. Although this method is often satisfactory, it has a number of disadvantages:

(a) This method cannot be adopted for the first iteration.

(b) This method cannot take care of the rate of variation of $f(\mathbf{X})$ in different directions.

(c) This method is quite insensitive to rapid changes in the step length that take place generally as the optimum point is approached.

2. At each stage, an initial step length $\varepsilon$ is calculated so as to reduce the objective function value by a given percentage. For doing this, we can approximate the behavior of the function $f(\lambda)$ to be linear in $\lambda$. Thus if

$$f(\mathbf{X}_i) = f(\lambda = 0) = f_1 \tag{7.43}$$

$$\frac{df}{d\lambda}(\mathbf{X}_i) = \left. \frac{df}{d\lambda}(\mathbf{X}_i + \lambda \mathbf{S}_i) \right|_{\lambda = 0} = \mathbf{S}^T \nabla f_i = f_1' \tag{7.44}$$

are known to us, the linear approximation of $f(\lambda)$ is given by

$$f(\lambda) \simeq f_1 + f_1' \lambda$$

To obtain a reduction of $\delta\%$ in the objective function value compared to $|f_1|$, the step length $\lambda = \varepsilon$ is given by

$$f_1' \varepsilon = -\frac{\delta}{100} |f_1|$$

that is,

$$\varepsilon = -\frac{\delta}{100} \frac{|f_1|}{f_1'} \tag{7.45}$$

It is to be noted that the value of $\varepsilon$ will always be positive since $f_1'$ given in Eq. (7.44) is always negative. This method yields good results if the percentage reduction ($\delta$) is restricted to small values on the order of 1 to 5.
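Equation (7.45) is a one-line computation. The following sketch evaluates it for illustrative numbers (an objective value $f_1 = 8$, a directional derivative $f_1' = -8$, and $\delta = 5$); the function name is hypothetical.

```python
def initial_trial_step(f1, f1_prime, delta_percent):
    """Trial step from Eq. (7.45): eps = -(delta/100) * |f1| / f1'.

    f1_prime is the directional derivative S^T grad f at lambda = 0; it must be
    negative for a usable direction, which makes the returned step positive.
    """
    if f1_prime >= 0.0:
        raise ValueError("S is not a descent direction (f1' must be negative)")
    return -(delta_percent / 100.0) * abs(f1) / f1_prime


eps = initial_trial_step(f1=8.0, f1_prime=-8.0, delta_percent=5.0)
# Linear model f1 + f1' * eps = 8 - 8 * 0.05 = 7.6, a 5% reduction of |f1|
```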


7.7.3 Termination Criteria 

In steps 4 and 7 of the algorithm, the optimization procedure is assumed to have converged whenever the maximum value of $\alpha$ ($\alpha^*$) becomes approximately zero or the results of the current iteration satisfy the relations stated in Eq. (7.32). In addition, one can always test the Kuhn-Tucker necessary conditions before terminating the procedure.

However, we can show that if the Kuhn-Tucker conditions are satisfied, the value of $\alpha^*$ will become zero. The Kuhn-Tucker conditions are given by

$$\nabla f + \sum_{j=1}^{p} \lambda_j \nabla g_j = \mathbf{0} \tag{7.46}$$

$$\lambda_j > 0, \quad j = 1, 2, \ldots, p \tag{7.47}$$

where the first $p$ constraints are assumed to be the active constraints. Equation (7.46) gives

$$\mathbf{S}^T \nabla f = -\sum_{j=1}^{p} \lambda_j \mathbf{S}^T \nabla g_j > 0 \tag{7.48}$$

if $\mathbf{S}$ is a usable feasible direction. Thus if the Kuhn-Tucker conditions are satisfied at a point $\mathbf{X}_i$, we will not be able to find any search direction $\mathbf{S}$ that satisfies the strict inequalities in the relations

$$
\begin{aligned}
\mathbf{S}^T \nabla g_j &< 0, \quad j = 1, 2, \ldots, p \\
\mathbf{S}^T \nabla f &< 0
\end{aligned}
\tag{7.49}
$$

However, these relations can be satisfied with strict equality sign by taking the trivial solution $\mathbf{S} = \mathbf{0}$, which means that the value of $\alpha^*$ in the direction-finding problem, Eqs. (7.39), is zero. Some modifications and accelerating techniques have been suggested to improve the convergence of the algorithm presented in this section and the details can be found in Refs. [7.7] and [7.8].

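Example 7.2

$$\text{Minimize } f(x_1, x_2) = x_1^2 + x_2^2 - 4x_1 - 4x_2 + 8$$

subject to

$$g_1(x_1, x_2) = x_1 + 2x_2 - 4 \le 0$$

with the starting point $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$. Take $\varepsilon_1 = 0.001$, $\varepsilon_2 = 0.001$, and $\varepsilon_3 = 0.01$.

SOLUTION

Step 1: At $\mathbf{X}_1 = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$:

$$f(\mathbf{X}_1) = 8 \quad \text{and} \quad g_1(\mathbf{X}_1) = -4$$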

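Iteration 1

Step 2: Since $g_1(\mathbf{X}_1) < 0$, we take the search direction as

$$\mathbf{S}_1 = -\nabla f(\mathbf{X}_1) = -\begin{Bmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 4 \\ 4 \end{Bmatrix}$$

This can be normalized to obtain $\mathbf{S}_1 = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}$.

Step 5: To find the new point $\mathbf{X}_2$, we have to find a suitable step length along $\mathbf{S}_1$. For this, we choose to minimize $f(\mathbf{X}_1 + \lambda \mathbf{S}_1)$ with respect to $\lambda$. Here

$$f(\mathbf{X}_1 + \lambda \mathbf{S}_1) = f(0 + \lambda, 0 + \lambda) = 2\lambda^2 - 8\lambda + 8$$

$$\frac{df}{d\lambda} = 0 \quad \text{at} \quad \lambda = 2$$

Thus the new point is given by $\mathbf{X}_2 = \begin{Bmatrix} 2 \\ 2 \end{Bmatrix}$ and $g_1(\mathbf{X}_2) = 2$. As the constraint is violated, the step size has to be corrected.

As $g_1' = g_1|_{\lambda=0} = -4$ and $g_1'' = g_1|_{\lambda=2} = 2$, linear interpolation gives the new step length as

$$\tilde{\lambda} = -\frac{g_1'}{g_1'' - g_1'} \lambda = \frac{4}{3}$$

This gives $g_1|_{\lambda = \tilde{\lambda}} = 0$ and hence $\mathbf{X}_2 = \begin{Bmatrix} \frac{4}{3} \\ \frac{4}{3} \end{Bmatrix}$.

Step 6: $f(\mathbf{X}_2) = \frac{8}{9}$.

Step 7: Here

$$\left| \frac{f(\mathbf{X}_1) - f(\mathbf{X}_2)}{f(\mathbf{X}_1)} \right| = \left| \frac{8 - \frac{8}{9}}{8} \right| = \frac{8}{9} > \varepsilon_2$$

$$\|\mathbf{X}_1 - \mathbf{X}_2\| = \left[ \left(0 - \tfrac{4}{3}\right)^2 + \left(0 - \tfrac{4}{3}\right)^2 \right]^{1/2} = 1.886 > \varepsilon_3$$

and hence the convergence criteria are not satisfied.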


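Iteration 2

Step 2: As $g_1 = 0$ at $\mathbf{X}_2$, we proceed to find a usable feasible direction.

Step 3: The direction-finding problem can be stated as [Eqs. (7.40)]:

$$\text{Minimize } f = -\alpha$$

subject to

$$
\begin{aligned}
t_1 + 2t_2 + \alpha + y_1 &= 3 \\
-\tfrac{4}{3} t_1 - \tfrac{4}{3} t_2 + \alpha + y_2 &= -\tfrac{8}{3} \\
t_1 + y_3 &= 2 \\
t_2 + y_4 &= 2 \\
t_1 &\ge 0 \\
t_2 &\ge 0 \\
\alpha &\ge 0
\end{aligned}
$$

where $y_1$ to $y_4$ are the nonnegative slack variables. Since an initial basic feasible solution is not readily available, we introduce an artificial variable $y_5 \ge 0$ into the second constraint equation. By adding the infeasibility form $w = y_5$, the LP problem can be solved to obtain the solution:

$$t_1^* = 2, \quad t_2^* = \tfrac{3}{10}, \quad \alpha^* = \tfrac{4}{10}, \quad y_4^* = \tfrac{17}{10}, \quad y_1^* = y_2^* = y_3^* = 0$$

$$-f_{\min} = \alpha^* = \tfrac{4}{10}$$

As $\alpha^* > 0$, the usable feasible direction is given by

$$\mathbf{S} = \begin{Bmatrix} s_1 \\ s_2 \end{Bmatrix} = \begin{Bmatrix} t_1^* - 1 \\ t_2^* - 1 \end{Bmatrix} = \begin{Bmatrix} 1.0 \\ -0.7 \end{Bmatrix}$$

Step 4: Since $\alpha^* > \varepsilon_1$, we go to the next step.

Step 5: We have to move along the direction $\mathbf{S}_2 = \begin{Bmatrix} 1.0 \\ -0.7 \end{Bmatrix}$ from the point $\mathbf{X}_2 = \begin{Bmatrix} 1.333 \\ 1.333 \end{Bmatrix}$. To find the minimizing step length, we minimize

$$
\begin{aligned}
f(\mathbf{X}_2 + \lambda \mathbf{S}_2) &= f(1.333 + \lambda,\; 1.333 - 0.7\lambda) \\
&= 1.49\lambda^2 - 0.4\lambda + 0.889
\end{aligned}
$$

As $df/d\lambda = 2.98\lambda - 0.4 = 0$ at $\lambda = 0.134$, the new point is given by

$$\mathbf{X}_3 = \mathbf{X}_2 + \lambda \mathbf{S}_2 = \begin{Bmatrix} 1.333 \\ 1.333 \end{Bmatrix} + 0.134 \begin{Bmatrix} 1.0 \\ -0.7 \end{Bmatrix} = \begin{Bmatrix} 1.467 \\ 1.239 \end{Bmatrix}$$

At this point, the constraint is satisfied since $g_1(\mathbf{X}_3) = -0.055$. Since point $\mathbf{X}_3$ lies in the interior of the feasible domain, we go to step 2.

The procedure is continued until the optimum point $\mathbf{X}^* = \begin{Bmatrix} 1.6 \\ 1.2 \end{Bmatrix}$ and $f_{\min} = 0.8$ are obtained.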
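The direction-finding step of iteration 2 can be checked numerically. The sketch below works in the original variables $(s_1, s_2, \alpha)$ of Eqs. (7.39) and, instead of the simplex method used in the text, simply enumerates the vertices of the small feasible polyhedron and keeps the one with the largest $\alpha$. Brute-force enumeration is an assumption made for brevity; it is practical only for tiny problems and is written here for two design variables and one active constraint. All names are illustrative.

```python
from itertools import combinations


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    d = det3(a)
    if abs(d) < 1e-12:
        return None
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution


def best_direction(grad_f, grad_g, theta=1.0):
    """Maximize alpha subject to Eqs. (7.39) for n = 2 and one active constraint."""
    # Each row is (coefficients of (s1, s2, alpha), right-hand side) for a.x <= b
    rows = [
        (list(grad_g) + [theta], 0.0),   # S^T grad g + theta * alpha <= 0
        (list(grad_f) + [1.0], 0.0),     # S^T grad f + alpha <= 0
        ([1.0, 0.0, 0.0], 1.0), ([-1.0, 0.0, 0.0], 1.0),   # -1 <= s1 <= 1
        ([0.0, 1.0, 0.0], 1.0), ([0.0, -1.0, 0.0], 1.0),   # -1 <= s2 <= 1
    ]
    best = None
    for trio in combinations(rows, 3):
        v = solve3([r[0] for r in trio], [r[1] for r in trio])
        if v is None:
            continue
        feasible = all(sum(c * x for c, x in zip(a, v)) <= b + 1e-9 for a, b in rows)
        if feasible and (best is None or v[2] > best[2]):
            best = v
    return best


# Data of iteration 2: grad f(X2) = (-4/3, -4/3), grad g1 = (1, 2)
s1, s2, alpha = best_direction([-4.0 / 3.0, -4.0 / 3.0], [1.0, 2.0])
```

Under these assumptions the best vertex is $(s_1, s_2, \alpha) = (1, -0.7, 0.4)$, in agreement with $t_1^* = 2$, $t_2^* = 3/10$, and $\alpha^* = 4/10$ above.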
7.8 ROSEN'S GRADIENT PROJECTION METHOD

The gradient projection method of Rosen [7.9, 7.10] does not require the solution of an 
auxiliary linear optimization problem to find the usable feasible direction. It uses the 
projection of the negative of the objective function gradient onto the constraints that 
are currently active. Although the method has been described by Rosen for a general 
nonlinear programming problem, its effectiveness is confined primarily to problems in 
which the constraints are all linear. Consider a problem with linear constraints: 

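$$\text{Minimize } f(\mathbf{X})$$

subject to

$$g_j(\mathbf{X}) = \sum_{i=1}^{n} a_{ij} x_i - b_j \le 0, \quad j = 1, 2, \ldots, m \tag{7.50}$$

Let the indices of the active constraints at any point be $j_1, j_2, \ldots, j_p$. The gradients of the active constraints are given by

$$\nabla g_j(\mathbf{X}) = \begin{Bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{Bmatrix}, \quad j = j_1, j_2, \ldots, j_p \tag{7.51}$$

By defining a matrix $\mathbf{N}$ of order $n \times p$ as

$$\mathbf{N} = [\nabla g_{j_1} \;\; \nabla g_{j_2} \;\; \cdots \;\; \nabla g_{j_p}] \tag{7.52}$$

the direction-finding problem for obtaining a usable feasible direction $\mathbf{S}$ can be posed as follows.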


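$$\text{Find } \mathbf{S} \text{ which minimizes } \mathbf{S}^T \nabla f(\mathbf{X}) \tag{7.53}$$

subject to

$$\mathbf{N}^T \mathbf{S} = \mathbf{0} \tag{7.54}$$

$$\mathbf{S}^T \mathbf{S} - 1 = 0 \tag{7.55}$$

where Eq. (7.55) denotes the normalization of the vector $\mathbf{S}$. To solve this equality-constrained problem, we construct the Lagrangian function as

$$L(\mathbf{S}, \boldsymbol{\lambda}, \beta) = \mathbf{S}^T \nabla f(\mathbf{X}) + \boldsymbol{\lambda}^T \mathbf{N}^T \mathbf{S} + \beta(\mathbf{S}^T \mathbf{S} - 1) \tag{7.56}$$

where

$$\boldsymbol{\lambda} = \begin{Bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_p \end{Bmatrix}$$

is the vector of Lagrange multipliers associated with Eqs. (7.54) and $\beta$ is the Lagrange multiplier associated with Eq. (7.55). The necessary conditions for the minimum are given by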



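$$\frac{\partial L}{\partial \mathbf{S}} = \nabla f(\mathbf{X}) + \mathbf{N} \boldsymbol{\lambda} + 2\beta \mathbf{S} = \mathbf{0} \tag{7.57}$$

$$\frac{\partial L}{\partial \boldsymbol{\lambda}} = \mathbf{N}^T \mathbf{S} = \mathbf{0} \tag{7.58}$$

$$\frac{\partial L}{\partial \beta} = \mathbf{S}^T \mathbf{S} - 1 = 0 \tag{7.59}$$

Equation (7.57) gives

$$\mathbf{S} = -\frac{1}{2\beta} (\nabla f + \mathbf{N} \boldsymbol{\lambda}) \tag{7.60}$$

Substitution of Eq. (7.60) into Eq. (7.58) gives

$$\mathbf{N}^T \mathbf{S} = -\frac{1}{2\beta} (\mathbf{N}^T \nabla f + \mathbf{N}^T \mathbf{N} \boldsymbol{\lambda}) = \mathbf{0} \tag{7.61}$$

If $\mathbf{S}$ is normalized according to Eq. (7.59), $\beta$ will not be zero, and hence Eq. (7.61) gives

$$\mathbf{N}^T \nabla f + \mathbf{N}^T \mathbf{N} \boldsymbol{\lambda} = \mathbf{0} \tag{7.62}$$

from which $\boldsymbol{\lambda}$ can be found as

$$\boldsymbol{\lambda} = -(\mathbf{N}^T \mathbf{N})^{-1} \mathbf{N}^T \nabla f \tag{7.63}$$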


This equation, when substituted in Eq. (7.60), gives

$$\mathbf{S} = -\frac{1}{2\beta} \left( \mathbf{I} - \mathbf{N} (\mathbf{N}^T \mathbf{N})^{-1} \mathbf{N}^T \right) \nabla f = -\frac{1}{2\beta} \mathbf{P} \nabla f \tag{7.64}$$

where

$$\mathbf{P} = \mathbf{I} - \mathbf{N} (\mathbf{N}^T \mathbf{N})^{-1} \mathbf{N}^T \tag{7.65}$$

is called the projection matrix. Disregarding the scaling constant $2\beta$, we can say that the matrix $\mathbf{P}$ projects the vector $-\nabla f(\mathbf{X})$ onto the intersection of all the hyperplanes perpendicular to the vectors

$$\nabla g_j, \quad j = j_1, j_2, \ldots, j_p$$

We assume that the constraints $g_j(\mathbf{X})$ are independent so that the columns of the matrix $\mathbf{N}$ will be linearly independent, and hence $\mathbf{N}^T \mathbf{N}$ will be nonsingular and can be inverted. The vector $\mathbf{S}$ can be normalized [without having to know the value of $\beta$ in Eq. (7.64)] as

$$\mathbf{S} = -\frac{\mathbf{P} \nabla f}{\|\mathbf{P} \nabla f\|} \tag{7.66}$$

If $\mathbf{X}_i$ is the starting point for the $i$th iteration (at which $g_{j_1}, g_{j_2}, \ldots, g_{j_p}$ are critically satisfied), we find $\mathbf{S}_i$ from Eq. (7.66) as

$$\mathbf{S}_i = -\frac{\mathbf{P}_i \nabla f(\mathbf{X}_i)}{\|\mathbf{P}_i \nabla f(\mathbf{X}_i)\|} \tag{7.67}$$

where $\mathbf{P}_i$ indicates the projection matrix $\mathbf{P}$ evaluated at the point $\mathbf{X}_i$. If $\mathbf{S}_i \ne \mathbf{0}$, we start from $\mathbf{X}_i$ and move along the direction $\mathbf{S}_i$ to find a new point $\mathbf{X}_{i+1}$ according to the familiar relation

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i \mathbf{S}_i \tag{7.68}$$

where $\lambda_i$ is the step length along the search direction $\mathbf{S}_i$. The computational details for calculating $\lambda_i$ will be considered later. However, if $\mathbf{S}_i = \mathbf{0}$, we have from Eqs. (7.64) and (7.63),

$$-\nabla f(\mathbf{X}_i) = \mathbf{N} \boldsymbol{\lambda} = \lambda_1 \nabla g_{j_1} + \lambda_2 \nabla g_{j_2} + \cdots + \lambda_p \nabla g_{j_p} \tag{7.69}$$

where

$$\boldsymbol{\lambda} = -(\mathbf{N}^T \mathbf{N})^{-1} \mathbf{N}^T \nabla f(\mathbf{X}_i) \tag{7.70}$$

Equation (7.69) denotes that the negative of the gradient of the objective function is given by a linear combination of the gradients of the active constraints at $\mathbf{X}_i$. Further, if all $\lambda_j$, given by Eq. (7.63), are nonnegative, the Kuhn-Tucker conditions [Eqs. (7.46) and (7.47)] will be satisfied and hence the procedure can be terminated.

However, if some $\lambda_j$ are negative and $\mathbf{S}_i = \mathbf{0}$, Eq. (7.69) indicates that some constraint normals $\nabla g_j$ make an obtuse angle with $-\nabla f$ at $\mathbf{X}_i$. This also means that the constraints $g_j$, for which $\lambda_j$ are negative, are active at $\mathbf{X}_i$ but should not be considered in finding a new search direction $\mathbf{S}$ that will be both feasible and usable. (If we consider all of them, the search direction $\mathbf{S}$ comes out to be zero.) This is illustrated in Fig. 7.9, where the constraint normal $\nabla g_j(\mathbf{X}_i)$ should not be considered in finding a usable feasible direction $\mathbf{S}$ at point $\mathbf{X}_i$.

In actual practice we do not discard all the active constraints for which $\lambda_j$ are negative in forming the matrix $\mathbf{N}$. Rather, we delete only one active constraint that corresponds to the most negative value of $\lambda_j$. That is, the new $\mathbf{N}$ matrix is taken as

$$\mathbf{N}_{\text{new}} = [\nabla g_{j_1} \;\; \nabla g_{j_2} \;\; \cdots \;\; \nabla g_{j_{q-1}} \;\; \nabla g_{j_{q+1}} \;\; \nabla g_{j_{q+2}} \;\; \cdots \;\; \nabla g_{j_p}] \tag{7.71}$$

where $\nabla g_{j_q}$ is dropped from $\mathbf{N}$ by assuming that $\lambda_q$ is most negative among $\lambda_j$ obtained from Eq. (7.63). The new projection matrix is formed, by dropping the constraint $g_{j_q}$, as

$$\mathbf{P}_{\text{new}} = \mathbf{I} - \mathbf{N}_{\text{new}} (\mathbf{N}_{\text{new}}^T \mathbf{N}_{\text{new}})^{-1} \mathbf{N}_{\text{new}}^T \tag{7.72}$$

and the new search direction $(\mathbf{S}_i)_{\text{new}}$ as

$$(\mathbf{S}_i)_{\text{new}} = -\frac{\mathbf{P}_{\text{new}} \nabla f(\mathbf{X}_i)}{\|\mathbf{P}_{\text{new}} \nabla f(\mathbf{X}_i)\|} \tag{7.73}$$

and this vector will be a nonzero vector in view of the new computations we have made. The new approximation $\mathbf{X}_{i+1}$ is found as usual by using Eq. (7.68). At the new point $\mathbf{X}_{i+1}$, a new constraint may become active (in Fig. 7.9, the constraint becomes active at the new point $\mathbf{X}_{i+1}$). In such a case, the new active constraint has to be added to the set of active constraints to find the new projection matrix at $\mathbf{X}_{i+1}$.

We shall now consider the computational details for computing the step length $\lambda_i$ in Eq. (7.68).

7.8.1 Determination of Step Length 

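The step length $\lambda_i$ in Eq. (7.68) may be taken as the minimizing step length $\lambda_i^*$ along the direction $\mathbf{S}_i$, that is,

$$f(\mathbf{X}_i + \lambda_i^* \mathbf{S}_i) = \min_{\lambda} f(\mathbf{X}_i + \lambda \mathbf{S}_i) \tag{7.74}$$

However, this minimizing step length $\lambda_i^*$ may give the point

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i$$

that lies outside the feasible region. Hence the following procedure is generally adopted to find a suitable step length $\lambda_i$. Since the constraints $g_j(\mathbf{X})$ are linear, we have

$$
\begin{aligned}
g_j(\lambda) = g_j(\mathbf{X}_i + \lambda \mathbf{S}_i) &= \sum_{i=1}^{n} a_{ij}(x_i + \lambda s_i) - b_j \\
&= \sum_{i=1}^{n} a_{ij} x_i - b_j + \lambda \sum_{i=1}^{n} a_{ij} s_i \\
&= g_j(\mathbf{X}_i) + \lambda \sum_{i=1}^{n} a_{ij} s_i, \quad j = 1, 2, \ldots, m
\end{aligned}
\tag{7.75}
$$

where

$$\mathbf{X}_i = \begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix} \quad \text{and} \quad \mathbf{S}_i = \begin{Bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{Bmatrix}$$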






This equation shows that $g_j(\lambda)$ will also be a linear function of $\lambda$. Thus if a particular constraint, say the $k$th, is not active at $\mathbf{X}_i$, it can be made to become active at the point $\mathbf{X}_i + \lambda_k \mathbf{S}_i$ by taking a step length $\lambda_k$ where

$$g_k(\lambda_k) = g_k(\mathbf{X}_i) + \lambda_k \sum_{i=1}^{n} a_{ik} s_i = 0$$

that is,

$$\lambda_k = -\frac{g_k(\mathbf{X}_i)}{\sum_{i=1}^{n} a_{ik} s_i} \tag{7.76}$$

Since the $k$th constraint is not active at $\mathbf{X}_i$, the value of $g_k(\mathbf{X}_i)$ will be negative and hence the sign of $\lambda_k$ will be same as that of the quantity $\left( \sum_{i=1}^{n} a_{ik} s_i \right)$. From Eqs. (7.75) we have

$$\frac{dg_k}{d\lambda} = \sum_{i=1}^{n} a_{ik} s_i \tag{7.77}$$

and hence the sign of $\lambda_k$ depends on the rate of change of $g_k$ with respect to $\lambda$. If this rate of change is negative, we will be moving away from the $k$th constraint in the positive direction of $\lambda$. However, if the rate of change ($dg_k/d\lambda$) is positive, we will be violating the constraint $g_k$ if we take any step length $\lambda$ larger than $\lambda_k$. Thus to avoid violation of any constraint, we have to take the step length ($\lambda_M$) as

$$\lambda_M = \min_{\substack{\lambda_k > 0 \text{ and } k \text{ is any integer among} \\ 1 \text{ to } m \text{ other than } j_1, j_2, \ldots, j_p}} (\lambda_k) \tag{7.78}$$
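Equations (7.76) to (7.78) translate directly into a short routine. The sketch below stores the coefficients of each linear constraint as one row (so row $k$ holds $a_{1k}, \ldots, a_{nk}$) and uses 0-based constraint indices; these storage conventions and the function names are assumptions made for illustration. The data are those of Example 7.3, which follows later in this section.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def max_feasible_step(rows, b, x, s, active):
    """Largest step along s that keeps all inactive linear constraints satisfied.

    rows[k], b[k] : coefficients and constant of g_k(X) = rows[k] . X - b[k] <= 0
    active        : set of 0-based indices of the currently active constraints
    Returns (lambda_M, index of the constraint that limits the step).
    """
    lam_max, limiting = math.inf, None
    for k, (row, bk) in enumerate(zip(rows, b)):
        if k in active:
            continue
        g_k = dot(row, x) - bk      # negative for an inactive constraint
        rate = dot(row, s)          # dg_k/d(lambda), Eq. (7.77)
        if rate > 0.0:              # only then is lambda_k positive
            lam_k = -g_k / rate     # Eq. (7.76)
            if lam_k < lam_max:
                lam_max, limiting = lam_k, k
    return lam_max, limiting


# Example 7.3 data: g1..g4 stored as rows 0..3, with g1 (row 0) active at X1 = (1, 1)
rows = [[1.0, 4.0], [2.0, 3.0], [-1.0, 0.0], [0.0, -1.0]]
b = [5.0, 6.0, 0.0, 0.0]
root17 = math.sqrt(17.0)
s1 = [-4.0 / root17, 1.0 / root17]   # normalized projected direction
lam_m, k_hit = max_feasible_step(rows, b, [1.0, 1.0], s1, active={0})
```

Here the limiting constraint is row 2 (that is, $g_3$) with $\lambda_M = \sqrt{17}/4 \approx 1.03$.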

In some cases, the function $f(\lambda)$ may have its minimum along the line $\mathbf{S}_i$ in between $\lambda = 0$ and $\lambda = \lambda_M$. Such a situation can be detected by calculating the value of

$$\frac{df}{d\lambda} = \mathbf{S}_i^T \nabla f(\lambda) \quad \text{at} \quad \lambda = \lambda_M$$

If the minimizing step length $\lambda_i^*$ lies in between $\lambda = 0$ and $\lambda = \lambda_M$, the quantity $df/d\lambda(\lambda_M)$ will be positive. In such a case we can find the minimizing step length $\lambda_i^*$ by interpolation or by using any of the techniques discussed in Chapter 5.

An important point to be noted is that if the step length is given by $\lambda_M$ (not by $\lambda_i^*$), at least one more constraint will be active at $\mathbf{X}_{i+1}$ than at $\mathbf{X}_i$. These additional constraints will have to be considered in generating the projection matrix at $\mathbf{X}_{i+1}$. On the other hand, if the step length is given by $\lambda_i^*$, no new constraint will be active at $\mathbf{X}_{i+1}$, and hence the projection matrix at $\mathbf{X}_{i+1}$ involves only those constraints that were active at $\mathbf{X}_i$.

Algorithm. The procedure involved in the application of the gradient projection method can be described by the following steps:

1. Start with an initial point $\mathbf{X}_1$. The point $\mathbf{X}_1$ has to be feasible, that is,

$$g_j(\mathbf{X}_1) \le 0, \quad j = 1, 2, \ldots, m$$

2. Set the iteration number as $i = 1$.

3. If $\mathbf{X}_i$ is an interior feasible point [i.e., if $g_j(\mathbf{X}_i) < 0$ for $j = 1, 2, \ldots, m$], set the direction of search as $\mathbf{S}_i = -\nabla f(\mathbf{X}_i)$, normalize the search direction as

$$\mathbf{S}_i = \frac{-\nabla f(\mathbf{X}_i)}{\|\nabla f(\mathbf{X}_i)\|}$$

and go to step 5. However, if $g_j(\mathbf{X}_i) = 0$ for $j = j_1, j_2, \ldots, j_p$, go to step 4.

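4. Calculate the projection matrix $\mathbf{P}_i$ as

$$\mathbf{P}_i = \mathbf{I} - \mathbf{N}_p (\mathbf{N}_p^T \mathbf{N}_p)^{-1} \mathbf{N}_p^T$$

where

$$\mathbf{N}_p = [\nabla g_{j_1}(\mathbf{X}_i) \;\; \nabla g_{j_2}(\mathbf{X}_i) \;\; \cdots \;\; \nabla g_{j_p}(\mathbf{X}_i)]$$

and find the normalized search direction $\mathbf{S}_i$ as

$$\mathbf{S}_i = \frac{-\mathbf{P}_i \nabla f(\mathbf{X}_i)}{\|\mathbf{P}_i \nabla f(\mathbf{X}_i)\|}$$

5. Test whether or not $\mathbf{S}_i = \mathbf{0}$. If $\mathbf{S}_i \ne \mathbf{0}$, go to step 6. If $\mathbf{S}_i = \mathbf{0}$, compute the vector $\boldsymbol{\lambda}$ at $\mathbf{X}_i$ as

$$\boldsymbol{\lambda} = -(\mathbf{N}_p^T \mathbf{N}_p)^{-1} \mathbf{N}_p^T \nabla f(\mathbf{X}_i)$$

If all the components of the vector $\boldsymbol{\lambda}$ are nonnegative, take $\mathbf{X}_{\text{opt}} = \mathbf{X}_i$ and stop the iterative procedure. If some of the components of $\boldsymbol{\lambda}$ are negative, find the component $\lambda_q$ that has the most negative value and form the new matrix $\mathbf{N}_p$ as

$$\mathbf{N}_p = [\nabla g_{j_1} \;\; \nabla g_{j_2} \;\; \cdots \;\; \nabla g_{j_{q-1}} \;\; \nabla g_{j_{q+1}} \;\; \cdots \;\; \nabla g_{j_p}]$$

and go to step 3.

6. If $\mathbf{S}_i \ne \mathbf{0}$, find the maximum step length $\lambda_M$ that is permissible without violating any of the constraints as $\lambda_M = \min(\lambda_k)$, $\lambda_k > 0$ and $k$ is any integer among 1 to $m$ other than $j_1, j_2, \ldots, j_p$. Also find the value of $df/d\lambda(\lambda_M) = \mathbf{S}_i^T \nabla f(\mathbf{X}_i + \lambda_M \mathbf{S}_i)$. If $df/d\lambda(\lambda_M)$ is zero or negative, take the step length as $\lambda_i = \lambda_M$. On the other hand, if $df/d\lambda(\lambda_M)$ is positive, find the minimizing step length $\lambda_i^*$ either by interpolation or by any of the methods discussed in Chapter 5, and take $\lambda_i = \lambda_i^*$.

7. Find the new approximation to the minimum as

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i \mathbf{S}_i$$

If $\lambda_i = \lambda_M$ or if $\lambda_M \le \lambda_i^*$, some new constraints (one or more) become active at $\mathbf{X}_{i+1}$ and hence generate the new matrix $\mathbf{N}_p$ to include the gradients of all active constraints evaluated at $\mathbf{X}_{i+1}$. Set the new iteration number as $i = i + 1$, and go to step 4. If $\lambda_i = \lambda_i^*$ and $\lambda_i^* < \lambda_M$, no new constraint will be active at $\mathbf{X}_{i+1}$ and hence the matrix $\mathbf{N}_p$ remains unaltered. Set the new value of $i$ as $i = i + 1$, and go to step 3.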


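Example 7.3

$$\text{Minimize } f(x_1, x_2) = x_1^2 + x_2^2 - 2x_1 - 4x_2$$

subject to

$$
\begin{aligned}
g_1(x_1, x_2) &= x_1 + 4x_2 - 5 \le 0 \\
g_2(x_1, x_2) &= 2x_1 + 3x_2 - 6 \le 0 \\
g_3(x_1, x_2) &= -x_1 \le 0 \\
g_4(x_1, x_2) &= -x_2 \le 0
\end{aligned}
$$

starting from the point $\mathbf{X}_1 = \begin{Bmatrix} 1.0 \\ 1.0 \end{Bmatrix}$.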


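SOLUTION

Iteration $i = 1$

Step 3: Since $g_j(\mathbf{X}_1) = 0$ for $j = 1$, we have $p = 1$ and $j_1 = 1$.

Step 4: As $\mathbf{N}_1 = [\nabla g_1(\mathbf{X}_1)] = \begin{Bmatrix} 1 \\ 4 \end{Bmatrix}$, the projection matrix is given by

$$\mathbf{P}_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{Bmatrix} 1 \\ 4 \end{Bmatrix} \left[ \begin{bmatrix} 1 & 4 \end{bmatrix} \begin{Bmatrix} 1 \\ 4 \end{Bmatrix} \right]^{-1} \begin{bmatrix} 1 & 4 \end{bmatrix} = \frac{1}{17} \begin{bmatrix} 16 & -4 \\ -4 & 1 \end{bmatrix}$$

The search direction $\mathbf{S}_1$ is given by

$$\mathbf{S}_1 = -\frac{1}{17} \begin{bmatrix} 16 & -4 \\ -4 & 1 \end{bmatrix} \begin{Bmatrix} 0 \\ -2 \end{Bmatrix} = \begin{Bmatrix} -\frac{8}{17} \\ \frac{2}{17} \end{Bmatrix} = \begin{Bmatrix} -0.4706 \\ 0.1176 \end{Bmatrix}$$

as

$$\nabla f(\mathbf{X}_1) = \begin{Bmatrix} 2x_1 - 2 \\ 2x_2 - 4 \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 0 \\ -2 \end{Bmatrix}$$

The normalized search direction can be obtained as

$$\mathbf{S}_1 = \frac{1}{[(-0.4706)^2 + (0.1176)^2]^{1/2}} \begin{Bmatrix} -0.4706 \\ 0.1176 \end{Bmatrix} = \begin{Bmatrix} -0.9701 \\ 0.2425 \end{Bmatrix}$$

Step 5: Since $\mathbf{S}_1 \ne \mathbf{0}$, we go to step 6.

Step 6: To find the step length $\lambda_M$, we set

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \mathbf{X}_1 + \lambda \mathbf{S}_1 = \begin{Bmatrix} 1.0 - 0.9701\lambda \\ 1.0 + 0.2425\lambda \end{Bmatrix}$$

For $j = 2$:

$$g_2(\mathbf{X}) = (2.0 - 1.9402\lambda) + (3.0 + 0.7275\lambda) - 6.0 = 0 \quad \text{at} \quad \lambda = \lambda_2 = -0.8246$$

For $j = 3$:

$$g_3(\mathbf{X}) = -(1.0 - 0.9701\lambda) = 0 \quad \text{at} \quad \lambda = \lambda_3 = 1.03$$

For $j = 4$:

$$g_4(\mathbf{X}) = -(1.0 + 0.2425\lambda) = 0 \quad \text{at} \quad \lambda = \lambda_4 = -4.124$$

Therefore,

$$\lambda_M = \lambda_3 = 1.03$$

Also,

$$
\begin{aligned}
f(\mathbf{X}) = f(\lambda) &= (1.0 - 0.9701\lambda)^2 + (1.0 + 0.2425\lambda)^2 - 2(1.0 - 0.9701\lambda) - 4(1.0 + 0.2425\lambda) \\
&= 0.9998\lambda^2 - 0.4850\lambda - 4.0
\end{aligned}
$$

$$\frac{df}{d\lambda} = 1.9996\lambda - 0.4850$$

$$\frac{df}{d\lambda}(\lambda_M) = 1.9996(1.03) - 0.4850 = 1.5746$$

As $df/d\lambda(\lambda_M) > 0$, we compute the minimizing step length $\lambda_1^*$ by setting $df/d\lambda = 0$. This gives

$$\lambda_1 = \lambda_1^* = \frac{0.4850}{1.9996} = 0.2425$$

Step 7: We obtain the new point $\mathbf{X}_2$ as

$$\mathbf{X}_2 = \mathbf{X}_1 + \lambda_1 \mathbf{S}_1 = \begin{Bmatrix} 1.0 \\ 1.0 \end{Bmatrix} + 0.2425 \begin{Bmatrix} -0.9701 \\ 0.2425 \end{Bmatrix} = \begin{Bmatrix} 0.7647 \\ 1.0588 \end{Bmatrix}$$

Since $\lambda_1 = \lambda_1^*$ and $\lambda_1^* < \lambda_M$, no new constraint has become active at $\mathbf{X}_2$ and hence the matrix $\mathbf{N}_1$ remains unaltered.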


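Iteration $i = 2$

Step 3: Since $g_1(\mathbf{X}_2) = 0$, we set $p = 1$, $j_1 = 1$ and go to step 4.

Step 4:

$$\mathbf{N}_1 = \begin{Bmatrix} 1 \\ 4 \end{Bmatrix}$$

$$\mathbf{P}_2 = \frac{1}{17} \begin{bmatrix} 16 & -4 \\ -4 & 1 \end{bmatrix}$$

$$\nabla f(\mathbf{X}_2) = \begin{Bmatrix} 2x_1 - 2 \\ 2x_2 - 4 \end{Bmatrix}_{\mathbf{X}_2} = \begin{Bmatrix} 1.5294 - 2.0 \\ 2.1176 - 4.0 \end{Bmatrix} = \begin{Bmatrix} -0.4706 \\ -1.8824 \end{Bmatrix}$$

$$\mathbf{S}_2 = -\mathbf{P}_2 \nabla f(\mathbf{X}_2) = \frac{1}{17} \begin{bmatrix} 16 & -4 \\ -4 & 1 \end{bmatrix} \begin{Bmatrix} 0.4706 \\ 1.8824 \end{Bmatrix} = \begin{Bmatrix} 0.0 \\ 0.0 \end{Bmatrix}$$

Step 5: Since $\mathbf{S}_2 = \mathbf{0}$, we compute the vector $\boldsymbol{\lambda}$ at $\mathbf{X}_2$ as

$$\boldsymbol{\lambda} = -(\mathbf{N}_1^T \mathbf{N}_1)^{-1} \mathbf{N}_1^T \nabla f(\mathbf{X}_2) = -\frac{1}{17} \begin{bmatrix} 1 & 4 \end{bmatrix} \begin{Bmatrix} -0.4706 \\ -1.8824 \end{Bmatrix} = 0.4706 > 0$$

The nonnegative value of $\lambda$ indicates that we have reached the optimum point and hence that

$$\mathbf{X}_{\text{opt}} = \mathbf{X}_2 = \begin{Bmatrix} 0.7647 \\ 1.0588 \end{Bmatrix} \quad \text{with} \quad f_{\text{opt}} = -4.059$$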
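The projection computations of Example 7.3 can be reproduced with a few lines of Python. The sketch below avoids forming $\mathbf{P}$ explicitly: it solves $(\mathbf{N}^T \mathbf{N}) \mathbf{y} = \mathbf{N}^T \nabla f$ for the multipliers of Eq. (7.63) and then uses Eq. (7.60) in the unnormalized form $-\mathbf{P} \nabla f = -(\nabla f + \mathbf{N} \boldsymbol{\lambda})$. The small Gaussian-elimination helper and all function names are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def projected_direction(grad_f, active_grads):
    """Return (-P grad_f, lambda) for the active constraint gradients (columns of N)."""
    ntn = [[dot(gi, gj) for gj in active_grads] for gi in active_grads]
    ntf = [dot(g, grad_f) for g in active_grads]
    lam = [-y for y in solve(ntn, ntf)]                      # Eq. (7.63)
    s = [-(gf + sum(l * g[i] for l, g in zip(lam, active_grads)))
         for i, gf in enumerate(grad_f)]                     # -(grad f + N lambda)
    return s, lam


grad = lambda x: [2 * x[0] - 2, 2 * x[1] - 4]   # gradient of f in Example 7.3
n1 = [[1.0, 4.0]]                               # gradient of the active constraint g1

s_1, _ = projected_direction(grad([1.0, 1.0]), n1)       # iteration 1
x_2 = [13.0 / 17.0, 18.0 / 17.0]                         # exact form of (0.7647, 1.0588)
s_2, lam_2 = projected_direction(grad(x_2), n1)          # iteration 2
```

Under these assumptions `s_1` equals $(-8/17, 2/17)$, `s_2` vanishes to rounding error, and `lam_2` is $8/17 \approx 0.4706$, matching the two iterations above.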


7.9 GENERALIZED REDUCED GRADIENT METHOD 

The generalized reduced gradient (GRG) method is an extension of the reduced gradient method that was presented originally for solving problems with linear constraints only [7.11]. To see the details of the GRG method, consider the nonlinear programming problem:

$$\text{Minimize } f(\mathbf{X}) \tag{7.79}$$

subject to

$$h_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{7.80}$$

$$l_k(\mathbf{X}) = 0, \quad k = 1, 2, \ldots, l \tag{7.81}$$

$$x_i^{(l)} \le x_i \le x_i^{(u)}, \quad i = 1, 2, \ldots, n \tag{7.82}$$

By adding a nonnegative slack variable to each of the inequality constraints in Eq. (7.80), the problem can be stated as

$$\text{Minimize } f(\mathbf{X}) \tag{7.83}$$

subject to

$$h_j(\mathbf{X}) + x_{n+j} = 0, \quad j = 1, 2, \ldots, m \tag{7.84}$$

$$l_k(\mathbf{X}) = 0, \quad k = 1, 2, \ldots, l \tag{7.85}$$

$$x_i^{(l)} \le x_i \le x_i^{(u)}, \quad i = 1, 2, \ldots, n \tag{7.86}$$

$$x_{n+j} \ge 0, \quad j = 1, 2, \ldots, m \tag{7.87}$$

with $n + m$ variables $(x_1, x_2, \ldots, x_n, x_{n+1}, \ldots, x_{n+m})$. The problem can be rewritten in a general form as:

$$\text{Minimize } f(\mathbf{X}) \tag{7.88}$$

subject to

$$g_j(\mathbf{X}) = 0, \quad j = 1, 2, \ldots, m + l \tag{7.89}$$

$$x_i^{(l)} \le x_i \le x_i^{(u)}, \quad i = 1, 2, \ldots, n + m \tag{7.90}$$

where the lower and upper bounds on the slack variable, $x_i$, are taken as 0 and a large number (infinity), respectively ($i = n + 1, n + 2, \ldots, n + m$).

The GRG method is based on the idea of elimination of variables using the equality constraints (see Section 2.4.1). Thus theoretically, one variable can be reduced from the set $x_i$ ($i = 1, 2, \ldots, n + m$) for each of the $m + l$ equality constraints given by Eqs. (7.84) and (7.85). It is convenient to divide the $n + m$ design variables arbitrarily into two sets as

$$\mathbf{X} = \begin{Bmatrix} \mathbf{Y} \\ \mathbf{Z} \end{Bmatrix} \tag{7.91}$$

$$\mathbf{Y} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-l} \end{Bmatrix} = \text{design or independent variables} \tag{7.92}$$

$$\mathbf{Z} = \begin{Bmatrix} z_1 \\ z_2 \\ \vdots \\ z_{m+l} \end{Bmatrix} = \text{state or dependent variables} \tag{7.93}$$


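and where the design variables are completely independent and the state variables are dependent on the design variables used to satisfy the constraints $g_j(\mathbf{X}) = 0$, $j = 1, 2, \ldots, m + l$.

Consider the first variations of the objective and constraint functions:

$$df(\mathbf{X}) = \sum_{i=1}^{n-l} \frac{\partial f}{\partial y_i} dy_i + \sum_{i=1}^{m+l} \frac{\partial f}{\partial z_i} dz_i = \nabla_Y^T f \, d\mathbf{Y} + \nabla_Z^T f \, d\mathbf{Z} \tag{7.94}$$

$$dg_i(\mathbf{X}) = \sum_{j=1}^{n-l} \frac{\partial g_i}{\partial y_j} dy_j + \sum_{j=1}^{m+l} \frac{\partial g_i}{\partial z_j} dz_j$$

or

$$d\mathbf{g} = [C] \, d\mathbf{Y} + [D] \, d\mathbf{Z} \tag{7.95}$$

where

$$\nabla_Y f = \begin{Bmatrix} \dfrac{\partial f}{\partial y_1} \\ \dfrac{\partial f}{\partial y_2} \\ \vdots \\ \dfrac{\partial f}{\partial y_{n-l}} \end{Bmatrix} \tag{7.96}$$

$$\nabla_Z f = \begin{Bmatrix} \dfrac{\partial f}{\partial z_1} \\ \dfrac{\partial f}{\partial z_2} \\ \vdots \\ \dfrac{\partial f}{\partial z_{m+l}} \end{Bmatrix} \tag{7.97}$$

$$[C] = \begin{bmatrix} \dfrac{\partial g_1}{\partial y_1} & \cdots & \dfrac{\partial g_1}{\partial y_{n-l}} \\ \vdots & & \vdots \\ \dfrac{\partial g_{m+l}}{\partial y_1} & \cdots & \dfrac{\partial g_{m+l}}{\partial y_{n-l}} \end{bmatrix} \tag{7.98}$$

$$[D] = \begin{bmatrix} \dfrac{\partial g_1}{\partial z_1} & \cdots & \dfrac{\partial g_1}{\partial z_{m+l}} \\ \vdots & & \vdots \\ \dfrac{\partial g_{m+l}}{\partial z_1} & \cdots & \dfrac{\partial g_{m+l}}{\partial z_{m+l}} \end{bmatrix} \tag{7.99}$$

$$d\mathbf{Y} = \begin{Bmatrix} dy_1 \\ dy_2 \\ \vdots \\ dy_{n-l} \end{Bmatrix} \tag{7.100}$$

$$d\mathbf{Z} = \begin{Bmatrix} dz_1 \\ dz_2 \\ \vdots \\ dz_{m+l} \end{Bmatrix} \tag{7.101}$$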


Assuming that the constraints are originally satisfied at the vector $\mathbf{X}$ ($\mathbf{g}(\mathbf{X}) = \mathbf{0}$), any change in the vector $d\mathbf{X}$ must correspond to $d\mathbf{g} = \mathbf{0}$ to maintain feasibility at $\mathbf{X} + d\mathbf{X}$. Equation (7.95) can be solved to express $d\mathbf{Z}$ as

$$d\mathbf{Z} = -[D]^{-1} [C] \, d\mathbf{Y} \tag{7.102}$$

The change in the objective function due to the change in $\mathbf{X}$ is given by Eq. (7.94), which can be expressed, using Eq. (7.102), as

$$df(\mathbf{X}) = \left( \nabla_Y^T f - \nabla_Z^T f \, [D]^{-1} [C] \right) d\mathbf{Y} \tag{7.103}$$

or

$$\frac{df}{d\mathbf{Y}}(\mathbf{X}) = \mathbf{G}_R \tag{7.104}$$

where

$$\mathbf{G}_R = \nabla_Y f - \left( [D]^{-1} [C] \right)^T \nabla_Z f \tag{7.105}$$

is called the generalized reduced gradient. Geometrically, the reduced gradient can be described as a projection of the original $n$-dimensional gradient onto the $(n - m)$-dimensional feasible region described by the design variables.
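Equation (7.105) can be evaluated without inverting $[D]$ explicitly: solve $[D]^T \mathbf{w} = \nabla_Z f$ and form $\nabla_Y f - [C]^T \mathbf{w}$. The sketch below does this for a small hypothetical problem, minimize $f = y^2 + z^2$ subject to $g = y + 2z - 4 = 0$, chosen only because the answer is easy to check by eliminating $z$. The helper and function names are illustrative.

```python
def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def reduced_gradient(grad_y, grad_z, c_mat, d_mat):
    """G_R = grad_Y f - ([D]^-1 [C])^T grad_Z f, Eq. (7.105).

    c_mat[r][j] = dg_r/dy_j and d_mat[r][j] = dg_r/dz_j (lists of rows).
    """
    size = len(grad_z)
    d_transposed = [[d_mat[r][c] for r in range(size)] for c in range(size)]
    w = solve(d_transposed, grad_z)   # w = [D]^-T grad_Z f
    return [gy - sum(c_mat[r][j] * w[r] for r in range(size))
            for j, gy in enumerate(grad_y)]


def example_gr(y, z):
    """Reduced gradient of f = y^2 + z^2 with g = y + 2z - 4 = 0 (hypothetical)."""
    return reduced_gradient([2.0 * y], [2.0 * z], [[1.0]], [[2.0]])
```

For this problem $\mathbf{G}_R = 2y - z$, which is $-2$ at the feasible point $(0, 2)$ and vanishes at the constrained minimum $(0.8, 1.6)$.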

We know that a necessary condition for the existence of a minimum of an unconstrained
function is that the components of the gradient vanish. Similarly, a constrained
function assumes its minimum value when the appropriate components of the reduced
gradient are zero. This condition can be verified to be the same as the Kuhn-Tucker
conditions to be satisfied at a relative minimum. In fact, the reduced gradient
$\mathbf{G}_R$ can be used to generate a search direction $\mathbf{S}$ to reduce the
value of the constrained objective function, just as the gradient $\nabla f$ can be used
to generate a search direction $\mathbf{S}$ for an unconstrained function. A suitable
step length $\lambda$ is to be chosen to minimize the value of $f$ along the search
direction $\mathbf{S}$. For any specific value of $\lambda$, the dependent variable
vector $\mathbf{Z}$ is updated using Eq. (7.102). Noting that Eq. (7.102) is based on
using a linear approximation to the original nonlinear problem, we find that the
constraints may not be exactly equal to zero at $\lambda$, that is, $d\mathbf{g} \ne \mathbf{0}$.
Hence when $\mathbf{Y}$ is held fixed, in order to have

$$g_i(\mathbf{X}) + dg_i(\mathbf{X}) = 0, \quad i = 1, 2, \ldots, m + l \tag{7.106}$$

we must have

$$\mathbf{g}(\mathbf{X}) + d\mathbf{g}(\mathbf{X}) = \mathbf{0} \tag{7.107}$$

Using Eq. (7.95) for $d\mathbf{g}$ in Eq. (7.107), we obtain

$$d\mathbf{Z} = [D]^{-1}\left(-\mathbf{g}(\mathbf{X}) - [C]\,d\mathbf{Y}\right) \tag{7.108}$$


The value of $d\mathbf{Z}$ given by Eq. (7.108) is used to update the value of $\mathbf{Z}$ as

$$\mathbf{Z}_{\text{update}} = \mathbf{Z}_{\text{current}} + d\mathbf{Z} \tag{7.109}$$

The constraints are evaluated at the updated vector $\mathbf{X}$, and the procedure [of
finding $d\mathbf{Z}$ using Eq. (7.108)] is repeated until $d\mathbf{Z}$ is sufficiently
small. Note that Eq. (7.108) can be considered as Newton's method of solving
simultaneous equations for $d\mathbf{Z}$.
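The correction loop can be sketched in a few lines. This is only an illustration under stated assumptions: a single state variable, the design variables held fixed (so $d\mathbf{Y} = \mathbf{0}$ and the update reduces to $dz = -g/(\partial g/\partial z)$), and a made-up constraint `g` that does not come from the text. The function name, tolerance, and iteration limit are likewise hypothetical.

```python
def restore_feasibility(g, dg_dz, y, z, tol=1e-10, max_iter=50):
    """Newton iteration on the state variable z with the design variable y fixed."""
    for _ in range(max_iter):
        residual = g(y, z)
        if abs(residual) < tol:
            break
        dz = -residual / dg_dz(y, z)  # scalar form of dZ = -[D]^{-1} g when dY = 0
        z += dz                        # Z_update = Z_current + dZ
    return z


# Hypothetical constraint g(y, z) = y^2 + z^3 - 9 = 0 with y fixed at 1.
def g(y, z):
    return y ** 2 + z ** 3 - 9.0


def dg_dz(y, z):
    return 3.0 * z ** 2


z_feasible = restore_feasibility(g, dg_dz, y=1.0, z=1.5)
```

With several state variables the scalar division would be replaced by the solution of a linear system with the matrix $[D]$.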


Algorithm

1. Specify the design and state variables. Start with an initial trial vector $\mathbf{X}$.
Identify the design and state variables ($\mathbf{Y}$ and $\mathbf{Z}$) for the problem
using the following guidelines.

(a) The state variables are to be selected to avoid singularity of the matrix $[D]$.

(b) Since the state variables are adjusted during the iterative process to maintain
feasibility, any component of $\mathbf{X}$ that is equal to its lower or upper bound
initially is to be designated a design variable.

(c) Since the slack variables appear as linear terms in the (originally inequality)
constraints, they should be designated as state variables. However, if the
initial value of any state variable is zero (its lower bound value), it should
be designated a design variable.

2. Compute the generalized reduced gradient. The GRG is determined using
Eq. (7.105). The derivatives involved in Eq. (7.105) can be evaluated
numerically, if necessary.

3. Test for convergence. If all the components of the GRG are close to zero, the
method can be considered to have converged and the current vector $\mathbf{X}$ can be
taken as the optimum solution of the problem. For this, the following test can
be used:

$$\|\mathbf{G}_R\| \le \varepsilon$$

where $\varepsilon$ is a small number. If this relation is not satisfied, we go to step 4.

4. Determine the search direction. The GRG can be used, like the gradient of an
unconstrained objective function, to generate a suitable search direction
$\mathbf{S}$. Techniques such as the steepest descent, Fletcher-Reeves,
Davidon-Fletcher-Powell, or Broyden-Fletcher-Goldfarb-Shanno methods
can be used for this purpose. For example, if a steepest descent method is
used, the vector $\mathbf{S}$ is determined as

$$\mathbf{S} = -\mathbf{G}_R \tag{7.110}$$


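5. Find the minimum along the search direction. Although any of the one-dimensional
minimization procedures discussed in Chapter 5 can be used to find a local minimum
of $f$ along the search direction $\mathbf{S}$, the following procedure can be used
conveniently.

(a) Find an estimate for $\lambda$ as the distance to the nearest side constraint. When
design variables are considered, we have

$$\lambda = \begin{cases} \dfrac{y_i^{(u)} - (y_i)_{\text{old}}}{s_i} & \text{if } s_i > 0 \\[2ex] \dfrac{y_i^{(l)} - (y_i)_{\text{old}}}{s_i} & \text{if } s_i < 0 \end{cases} \tag{7.111}$$

where $s_i$ is the $i$th component of $\mathbf{S}$. Similarly, when state variables are
considered, we have, from Eq. (7.102),

$$d\mathbf{Z} = -[D]^{-1}[C]\,d\mathbf{Y} \tag{7.112}$$

Using $d\mathbf{Y} = \lambda\mathbf{S}$, Eq. (7.112) gives the search direction for the
variables $\mathbf{Z}$ as

$$\mathbf{T} = -[D]^{-1}[C]\,\mathbf{S} \tag{7.113}$$

Thus

$$\lambda = \begin{cases} \dfrac{z_i^{(u)} - (z_i)_{\text{old}}}{t_i} & \text{if } t_i > 0 \\[2ex] \dfrac{z_i^{(l)} - (z_i)_{\text{old}}}{t_i} & \text{if } t_i < 0 \end{cases} \tag{7.114}$$

where $t_i$ is the $i$th component of $\mathbf{T}$.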

(b) The minimum value of $\lambda$ given by Eq. (7.111), $\lambda_1$, makes some design
variable attain its lower or upper bound. Similarly, the minimum value of
$\lambda$ given by Eq. (7.114), $\lambda_2$, will make some state variable attain its lower
or upper bound. The smaller of $\lambda_1$ and $\lambda_2$ can be used as an upper bound
on the value of $\lambda$ for initializing a suitable one-dimensional minimization
procedure. The quadratic interpolation method can be used conveniently for
finding the optimal step length $\lambda^*$.

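(c) Find the new vector $\mathbf{X}_{\text{new}}$:

$$\mathbf{X}_{\text{new}} = \begin{Bmatrix} \mathbf{Y}_{\text{old}} + d\mathbf{Y} \\ \mathbf{Z}_{\text{old}} + d\mathbf{Z} \end{Bmatrix} = \begin{Bmatrix} \mathbf{Y}_{\text{old}} + \lambda^*\mathbf{S} \\ \mathbf{Z}_{\text{old}} + \lambda^*\mathbf{T} \end{Bmatrix} \tag{7.115}$$

If the vector $\mathbf{X}_{\text{new}}$ corresponding to $\lambda^*$ is found infeasible, then
$\mathbf{Y}_{\text{new}}$ is held constant and $\mathbf{Z}_{\text{new}}$ is modified using
Eq. (7.108) with $d\mathbf{Z} = \mathbf{Z}_{\text{new}} - \mathbf{Z}_{\text{old}}$.
Finally, when convergence is achieved with Eq. (7.108), we find that

$$\mathbf{X}_{\text{new}} = \begin{Bmatrix} \mathbf{Y}_{\text{old}} + \Delta\mathbf{Y} \\ \mathbf{Z}_{\text{old}} + \Delta\mathbf{Z} \end{Bmatrix} \tag{7.116}$$

and go to step 1.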
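The ratio tests of Eqs. (7.111) and (7.114) in step 5(a) can be sketched as a single helper. The function name `step_bound` and its argument layout are illustrative assumptions; the numbers passed to it are the starting values, search directions, and bounds used in Example 7.4 of this section.

```python
def step_bound(values, direction, lower, upper):
    """Largest step along `direction` that keeps every component within its bounds."""
    bound = float("inf")
    for v, s, lo, hi in zip(values, direction, lower, upper):
        if s > 0:
            bound = min(bound, (hi - v) / s)   # distance to the upper bound
        elif s < 0:
            bound = min(bound, (lo - v) / s)   # distance to the lower bound
    return bound


# Design variables (y1, y2) = (x1, x2) with S = (9.2, -9.2) and -3 <= x_i <= 3.
lam1 = step_bound([-2.6, 2.0], [9.2, -9.2], [-3.0, -3.0], [3.0, 3.0])
# State variable z1 = x3 with T = -4.4275.
lam2 = step_bound([2.0], [-4.4275], [-3.0], [3.0])
lam_max = min(lam1, lam2)  # upper bound for the one-dimensional search
```

Components with a zero direction entry never reach a bound, so they are skipped.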


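Example 7.4

Minimize $f(x_1, x_2, x_3) = (x_1 - x_2)^2 + (x_2 - x_3)^4$

subject to

$$g_1(\mathbf{X}) = x_1(1 + x_2^2) + x_3^4 - 3 = 0$$

$$-3 \le x_i \le 3, \quad i = 1, 2, 3$$

using the GRG method.

SOLUTION

Step 1: We choose arbitrarily the independent and dependent variables as

$$\mathbf{Y} = \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}, \qquad \mathbf{Z} = \{z_1\} = \{x_3\}$$

Let the starting vector be

$$\mathbf{X}_1 = \begin{Bmatrix} -2.6 \\ 2 \\ 2 \end{Bmatrix}$$

with $f(\mathbf{X}_1) = 21.16$.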

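Step 2: Compute the GRG at $\mathbf{X}_1$. Noting that

$$\frac{\partial f}{\partial x_1} = 2(x_1 - x_2)$$

$$\frac{\partial f}{\partial x_2} = -2(x_1 - x_2) + 4(x_2 - x_3)^3$$

$$\frac{\partial f}{\partial x_3} = -4(x_2 - x_3)^3$$

$$\frac{\partial g_1}{\partial x_1} = 1 + x_2^2, \qquad \frac{\partial g_1}{\partial x_2} = 2x_1x_2, \qquad \frac{\partial g_1}{\partial x_3} = 4x_3^3$$

we find, at $\mathbf{X}_1$,

$$\nabla_Y f = \begin{Bmatrix} \dfrac{\partial f}{\partial x_1} \\[1.5ex] \dfrac{\partial f}{\partial x_2} \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} 2(-2.6 - 2) \\ -2(-2.6 - 2) + 4(2 - 2)^3 \end{Bmatrix} = \begin{Bmatrix} -9.2 \\ 9.2 \end{Bmatrix}$$

$$\nabla_Z f = \left\{\frac{\partial f}{\partial x_3}\right\}_{\mathbf{X}_1} = \{-4(x_2 - x_3)^3\}_{\mathbf{X}_1} = 0$$

$$[C] = \begin{bmatrix} \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} \end{bmatrix}_{\mathbf{X}_1} = \begin{bmatrix} 5 & -10.4 \end{bmatrix}$$

$$[D] = \begin{bmatrix} \dfrac{\partial g_1}{\partial x_3} \end{bmatrix}_{\mathbf{X}_1} = [32]$$

$$[D]^{-1} = \left[\tfrac{1}{32}\right], \qquad [D]^{-1}[C] = \tfrac{1}{32}\begin{bmatrix} 5 & -10.4 \end{bmatrix} = \begin{bmatrix} 0.15625 & -0.325 \end{bmatrix}$$

$$\mathbf{G}_R = \nabla_Y f - \left[[D]^{-1}[C]\right]^T \nabla_Z f = \begin{Bmatrix} -9.2 \\ 9.2 \end{Bmatrix} - \begin{Bmatrix} 0.15625 \\ -0.325 \end{Bmatrix}(0) = \begin{Bmatrix} -9.2 \\ 9.2 \end{Bmatrix}$$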
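The arithmetic of Step 2 can be checked with a short script. It evaluates Eq. (7.105) for this example, where $[D]$ is a $1 \times 1$ matrix so that its inverse is a scalar division. The function names are illustrative choices, not notation from the text.

```python
def grad_f(x1, x2, x3):
    """Gradient of f = (x1 - x2)^2 + (x2 - x3)^4."""
    return [2 * (x1 - x2),
            -2 * (x1 - x2) + 4 * (x2 - x3) ** 3,
            -4 * (x2 - x3) ** 3]


def grad_g(x1, x2, x3):
    """Gradient of g1 = x1 (1 + x2^2) + x3^4 - 3."""
    return [1 + x2 ** 2, 2 * x1 * x2, 4 * x3 ** 3]


def reduced_gradient(x):
    """G_R = grad_Y f - ([D]^{-1}[C])^T grad_Z f with Y = (x1, x2), Z = (x3)."""
    df, dg = grad_f(*x), grad_g(*x)
    grad_y, grad_z = df[:2], df[2]
    C, D = dg[:2], dg[2]
    return [gy - (c / D) * grad_z for gy, c in zip(grad_y, C)]


G_R = reduced_gradient([-2.6, 2.0, 2.0])
```

The same function can be reused at any later iterate of the example, since only the point changes.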


Step 3: Since the components of $\mathbf{G}_R$ are not zero, the point $\mathbf{X}_1$ is not
optimum, and hence we go to step 4.

Step 4: We use the steepest descent method and take the search direction as

$$\mathbf{S} = -\mathbf{G}_R = \begin{Bmatrix} 9.2 \\ -9.2 \end{Bmatrix}$$


Step 5: We find the optimal step length along $\mathbf{S}$.

(a) Considering the design variables, we use Eq. (7.111) to obtain

For $y_1 = x_1$:

$$\lambda = \frac{3 - (-2.6)}{9.2} = 0.6087$$

For $y_2 = x_2$:

$$\lambda = \frac{-3 - (2)}{-9.2} = 0.5435$$

Thus the smaller value gives $\lambda_1 = 0.5435$. Equation (7.113) gives

$$\mathbf{T} = -\left([D]^{-1}[C]\right)\mathbf{S} = -\begin{bmatrix} 0.15625 & -0.325 \end{bmatrix}\begin{Bmatrix} 9.2 \\ -9.2 \end{Bmatrix} = -4.4275$$

and hence Eq. (7.114) leads to

For $z_1 = x_3$:

$$\lambda = \frac{-3 - (2)}{-4.4275} = 1.1293$$

Thus $\lambda_2 = 1.1293$.




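(b) The upper bound on $\lambda$ is given by the smaller of $\lambda_1$ and $\lambda_2$, which
is equal to 0.5435. By expressing

$$\mathbf{X} = \begin{Bmatrix} \mathbf{Y} + \lambda\mathbf{S} \\ \mathbf{Z} + \lambda\mathbf{T} \end{Bmatrix}$$

we obtain

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} -2.6 \\ 2 \\ 2 \end{Bmatrix} + \lambda \begin{Bmatrix} 9.2 \\ -9.2 \\ -4.4275 \end{Bmatrix} = \begin{Bmatrix} -2.6 + 9.2\lambda \\ 2 - 9.2\lambda \\ 2 - 4.4275\lambda \end{Bmatrix}$$

and hence

$$f(\lambda) = f(\mathbf{X}) = (-2.6 + 9.2\lambda - 2 + 9.2\lambda)^2 + (2 - 9.2\lambda - 2 + 4.4275\lambda)^4$$

$$= 518.7806\lambda^4 + 338.56\lambda^2 - 169.28\lambda + 21.16$$

$df/d\lambda = 0$ gives

$$2075.1225\lambda^3 + 677.12\lambda - 169.28 = 0$$

from which we find the root as $\lambda^* \approx 0.22$. Since $\lambda^*$ is less than the upper
bound value 0.5435, we use $\lambda^*$.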
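The root of the cubic can be located numerically. The sketch below uses plain bisection on the interval $[0, 0.5435]$ set by the step bound; bisection is a stand-in chosen for brevity rather than the quadratic interpolation method mentioned in the algorithm, and the helper names are illustrative.

```python
def dphi(lam):
    """Derivative of f along the search direction, df/d(lambda)."""
    return 2075.1225 * lam ** 3 + 677.12 * lam - 169.28


def bisect(func, lo, hi, tol=1e-10):
    """Root of func on [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


lam_star = bisect(dphi, 0.0, 0.5435)  # close to the value 0.22 quoted in the text
```

Because the derivative is negative at $\lambda = 0$ and positive at the upper bound, the bracket is guaranteed to contain the minimizer.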

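(c) The new vector $\mathbf{X}_{\text{new}}$ is given by

$$\mathbf{X}_{\text{new}} = \begin{Bmatrix} \mathbf{Y}_{\text{old}} + d\mathbf{Y} \\ \mathbf{Z}_{\text{old}} + d\mathbf{Z} \end{Bmatrix} = \begin{Bmatrix} \mathbf{Y}_{\text{old}} + \lambda^*\mathbf{S} \\ \mathbf{Z}_{\text{old}} + \lambda^*\mathbf{T} \end{Bmatrix} = \begin{Bmatrix} -2.6 + 0.22(9.2) \\ 2 + 0.22(-9.2) \\ 2 + 0.22(-4.4275) \end{Bmatrix} = \begin{Bmatrix} -0.576 \\ -0.024 \\ 1.02595 \end{Bmatrix}$$

with

$$d\mathbf{Y} = \begin{Bmatrix} 2.024 \\ -2.024 \end{Bmatrix}, \qquad d\mathbf{Z} = \{-0.97405\}$$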


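Now, we need to check whether this vector is feasible. Since

$$g_1(\mathbf{X}_{\text{new}}) = (-0.576)[1 + (-0.024)^2] + (1.02595)^4 - 3 = -2.4684 \ne 0$$

the vector $\mathbf{X}_{\text{new}}$ is infeasible. Hence we hold $\mathbf{Y}_{\text{new}}$ constant
and modify $\mathbf{Z}_{\text{new}}$ using Newton's method [Eq. (7.108)] as

$$d\mathbf{Z} = [D]^{-1}\left[-\mathbf{g}(\mathbf{X}) - [C]\,d\mathbf{Y}\right]$$

Since

$$[D] = \begin{bmatrix} \dfrac{\partial g_1}{\partial z_1} \end{bmatrix} = [4x_3^3] = [4(1.02595)^3] = [4.319551]$$

$$g_1(\mathbf{X}) = \{-2.4684\}$$

$$[C] = \begin{bmatrix} \dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_2} \end{bmatrix} = \{[2(-0.576 + 0.024)]\ [-2(-0.576 + 0.024) + 4(-0.024 - 1.02595)^3]\} = \begin{bmatrix} -1.104 & -3.5258 \end{bmatrix}$$

$$d\mathbf{Z} = \frac{1}{4.319551}\left[2.4684 - \begin{bmatrix} -1.104 & -3.5258 \end{bmatrix}\begin{Bmatrix} 2.024 \\ -2.024 \end{Bmatrix}\right] = \{-0.5633\}$$

we have $\mathbf{Z}_{\text{new}} = \mathbf{Z}_{\text{old}} + d\mathbf{Z} = \{2 - 0.5633\} = \{1.4367\}$.
The current $\mathbf{X}_{\text{new}}$ becomes

$$\mathbf{X}_{\text{new}} = \begin{Bmatrix} \mathbf{Y}_{\text{old}} + d\mathbf{Y} \\ \mathbf{Z}_{\text{old}} + d\mathbf{Z} \end{Bmatrix} = \begin{Bmatrix} -0.576 \\ -0.024 \\ 1.4367 \end{Bmatrix}$$

The constraint becomes

$$g_1 = (-0.576)(1 + (-0.024)^2) + (1.4367)^4 - 3 = 0.6842 \ne 0$$

Since this $\mathbf{X}_{\text{new}}$ is infeasible, we need to apply Newton's method
[Eq. (7.108)] at the current $\mathbf{X}_{\text{new}}$. In the present case, instead of repeating
Newton's iteration, we can find the value of $\mathbf{Z}_{\text{new}} = \{x_3\}_{\text{new}}$ by
satisfying the constraint as

$$g_1(\mathbf{X}) = (-0.576)[1 + (-0.024)^2] + x_3^4 - 3 = 0$$

or

$$x_3 = (2.4237)^{0.25} = 1.2477$$

This gives

$$\mathbf{X}_{\text{new}} = \begin{Bmatrix} -0.576 \\ -0.024 \\ 1.2477 \end{Bmatrix}$$

and

$$f(\mathbf{X}_{\text{new}}) = (-0.576 + 0.024)^2 + (-0.024 - 1.2477)^4 = 2.9201$$

Next we go to step 1.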

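Step 1: We do not have to change the set of independent and dependent variables and
hence we go to the next step.

Step 2: We compute the GRG at the current $\mathbf{X}$ using Eq. (7.105). Since

$$\nabla_Y f = \begin{Bmatrix} \dfrac{\partial f}{\partial x_1} \\[1.5ex] \dfrac{\partial f}{\partial x_2} \end{Bmatrix} = \begin{Bmatrix} 2(-0.576 + 0.024) \\ -2(-0.576 + 0.024) + 4(-0.024 - 1.2477)^3 \end{Bmatrix} = \begin{Bmatrix} -1.104 \\ -7.1225 \end{Bmatrix}$$

$$\nabla_Z f = \left\{\frac{\partial f}{\partial z_1}\right\} = \left\{\frac{\partial f}{\partial x_3}\right\} = \{-4(-0.024 - 1.2477)^3\} = \{8.2265\}$$

$$[C] = \begin{bmatrix} \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} \end{bmatrix} = \begin{bmatrix} (1 + (-0.024)^2) & 2(-0.576)(-0.024) \end{bmatrix} = \begin{bmatrix} 1.000576 & 0.027648 \end{bmatrix}$$

$$[D] = \begin{bmatrix} \dfrac{\partial g_1}{\partial x_3} \end{bmatrix} = [4x_3^3] = [4(1.2477)^3] = [7.7694]$$

$$[D]^{-1}[C] = \frac{1}{7.7694}\begin{bmatrix} 1.000576 & 0.027648 \end{bmatrix} = \begin{bmatrix} 0.128784 & 0.003558 \end{bmatrix}$$

we obtain

$$\mathbf{G}_R = \nabla_Y f - \left[[D]^{-1}[C]\right]^T \nabla_Z f = \begin{Bmatrix} -1.104 \\ -7.1225 \end{Bmatrix} - \begin{Bmatrix} 0.128784 \\ 0.003558 \end{Bmatrix}(8.2265) = \begin{Bmatrix} -2.1634 \\ -7.1518 \end{Bmatrix}$$

Since $\mathbf{G}_R \ne \mathbf{0}$, we need to proceed to the next step.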

Note: It can be seen that the value of the objective function reduced from an initial 
value of 21.16 to 2.9201 in one iteration. 


7.10 SEQUENTIAL QUADRATIC PROGRAMMING 

Sequential quadratic programming is one of the most recently developed and perhaps
one of the best methods of optimization. The method has a theoretical basis that
is related to (1) the solution of a set of nonlinear equations using Newton's method,
and (2) the derivation of simultaneous nonlinear equations by applying the Kuhn-Tucker
conditions to the Lagrangian of the constrained optimization problem. In this section we
present both the derivation of the equations and the solution procedure of the sequential
quadratic programming approach.

7.10.1 Derivation 

Consider a nonlinear optimization problem with only equality constraints:

Find $\mathbf{X}$ which minimizes $f(\mathbf{X})$

subject to

$$h_k(\mathbf{X}) = 0, \quad k = 1, 2, \ldots, p \tag{7.117}$$

The extension to include inequality constraints will be considered at a later stage. The
Lagrange function, $L(\mathbf{X}, \boldsymbol{\lambda})$, corresponding to the problem of
Eq. (7.117) is given by

$$L = f(\mathbf{X}) + \sum_{k=1}^{p} \lambda_k h_k(\mathbf{X}) \tag{7.118}$$

where $\lambda_k$ is the Lagrange multiplier for the $k$th equality constraint. The Kuhn-Tucker
necessary conditions can be stated as

$$\nabla L = \mathbf{0} \quad \text{or} \quad \nabla f + \sum_{k=1}^{p} \lambda_k \nabla h_k = \mathbf{0} \quad \text{or} \quad \nabla f + [A]\boldsymbol{\lambda} = \mathbf{0} \tag{7.119}$$

$$h_k(\mathbf{X}) = 0, \quad k = 1, 2, \ldots, p \tag{7.120}$$

where $[A]$ is an $n \times p$ matrix whose $k$th column denotes the gradient of the function
$h_k$. Equations (7.119) and (7.120) represent a set of $n + p$ nonlinear equations in
$n + p$ unknowns ($x_i$, $i = 1, \ldots, n$ and $\lambda_k$, $k = 1, \ldots, p$). These nonlinear
equations can be solved using Newton's method. For convenience, we rewrite Eqs. (7.119) and
(7.120) as


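$$\mathbf{F}(\mathbf{Y}) = \mathbf{0} \tag{7.121}$$

where

$$\mathbf{F} = \begin{Bmatrix} \nabla L \\ \mathbf{h} \end{Bmatrix}_{(n+p) \times 1}, \qquad \mathbf{Y} = \begin{Bmatrix} \mathbf{X} \\ \boldsymbol{\lambda} \end{Bmatrix}_{(n+p) \times 1}, \qquad \mathbf{0} = \begin{Bmatrix} \mathbf{0} \\ \mathbf{0} \end{Bmatrix}_{(n+p) \times 1} \tag{7.122}$$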


According to Newton's method, the solution of Eq. (7.121) can be found iteratively
as (see Section 6.11)

$$\mathbf{Y}_{j+1} = \mathbf{Y}_j + \Delta\mathbf{Y}_j \tag{7.123}$$

with

$$[\nabla F]_j\,\Delta\mathbf{Y}_j = -\mathbf{F}(\mathbf{Y}_j) \tag{7.124}$$

where $\mathbf{Y}_j$ is the solution at the start of the $j$th iteration, $\Delta\mathbf{Y}_j$ is the
change in $\mathbf{Y}_j$ necessary to generate the improved solution, $\mathbf{Y}_{j+1}$, and
$[\nabla F]_j = [\nabla F(\mathbf{Y}_j)]$ is the $(n + p) \times (n + p)$ Jacobian matrix of the
nonlinear equations whose $i$th column denotes the gradient of the function $F_i(\mathbf{Y})$
with respect to the vector $\mathbf{Y}$. By substituting Eqs. (7.121) and (7.122) into
Eq. (7.124), we obtain

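$$\begin{bmatrix} [\nabla^2 L] & [H] \\ [H]^T & [0] \end{bmatrix}_j \begin{Bmatrix} \Delta\mathbf{X} \\ \Delta\boldsymbol{\lambda} \end{Bmatrix}_j = -\begin{Bmatrix} \nabla L \\ \mathbf{h} \end{Bmatrix}_j \tag{7.125}$$

$$\Delta\mathbf{X}_j = \mathbf{X}_{j+1} - \mathbf{X}_j \tag{7.126}$$

$$\Delta\boldsymbol{\lambda}_j = \boldsymbol{\lambda}_{j+1} - \boldsymbol{\lambda}_j \tag{7.127}$$

where $[\nabla^2 L]_{n \times n}$ denotes the Hessian matrix of the Lagrange function and $[H]$ is
the $n \times p$ matrix of constraint gradients, denoted $[A]$ in Eq. (7.119). The first set of
equations in (7.125) can be written separately as

$$[\nabla^2 L]_j\,\Delta\mathbf{X}_j + [H]_j\,\Delta\boldsymbol{\lambda}_j = -\nabla L_j \tag{7.128}$$

Using Eq. (7.127) for $\Delta\boldsymbol{\lambda}_j$ and Eq. (7.119) for $\nabla L_j$, Eq. (7.128) can
be expressed as

$$[\nabla^2 L]_j\,\Delta\mathbf{X}_j + [H]_j(\boldsymbol{\lambda}_{j+1} - \boldsymbol{\lambda}_j) = -\nabla f_j - [H]_j\boldsymbol{\lambda}_j \tag{7.129}$$

which can be simplified to obtain

$$[\nabla^2 L]_j\,\Delta\mathbf{X}_j + [H]_j\boldsymbol{\lambda}_{j+1} = -\nabla f_j \tag{7.130}$$

Equation (7.130) and the second set of equations in (7.125) can now be combined as

$$\begin{bmatrix} [\nabla^2 L] & [H] \\ [H]^T & [0] \end{bmatrix}_j \begin{Bmatrix} \Delta\mathbf{X}_j \\ \boldsymbol{\lambda}_{j+1} \end{Bmatrix} = -\begin{Bmatrix} \nabla f_j \\ \mathbf{h}_j \end{Bmatrix} \tag{7.131}$$

Equations (7.131) can be solved to find the change in the design vector $\Delta\mathbf{X}_j$ and
the new values of the Lagrange multipliers, $\boldsymbol{\lambda}_{j+1}$. The iterative process
indicated by Eq. (7.131) can be continued until convergence is achieved.

Now consider the following quadratic programming problem:

Find $\Delta\mathbf{X}$ that minimizes the quadratic objective function

$$Q = \nabla f^T \Delta\mathbf{X} + \tfrac{1}{2}\Delta\mathbf{X}^T[\nabla^2 L]\Delta\mathbf{X}$$

subject to the linear equality constraints

$$h_k + \nabla h_k^T \Delta\mathbf{X} = 0, \quad k = 1, 2, \ldots, p \quad \text{or} \quad \mathbf{h} + [H]^T\Delta\mathbf{X} = \mathbf{0} \tag{7.132}$$

The Lagrange function, $\tilde{L}$, corresponding to the problem of Eq. (7.132) is given by

$$\tilde{L} = \nabla f^T \Delta\mathbf{X} + \tfrac{1}{2}\Delta\mathbf{X}^T[\nabla^2 L]\Delta\mathbf{X} + \sum_{k=1}^{p} \lambda_k\left(h_k + \nabla h_k^T \Delta\mathbf{X}\right) \tag{7.133}$$

where $\lambda_k$ is the Lagrange multiplier associated with the $k$th equality constraint.

The Kuhn-Tucker necessary conditions can be stated as

$$\nabla f + [\nabla^2 L]\Delta\mathbf{X} + [H]\boldsymbol{\lambda} = \mathbf{0} \tag{7.134}$$

$$h_k + \nabla h_k^T \Delta\mathbf{X} = 0, \quad k = 1, 2, \ldots, p \tag{7.135}$$

Equations (7.134) and (7.135) can be identified to be the same as Eq. (7.131) in matrix
form. This shows that the original problem of Eq. (7.117) can be solved iteratively
by solving the quadratic programming problem defined by Eq. (7.132). In fact, when
inequality constraints are added to the original problem, the quadratic programming
problem of Eq. (7.132) becomes

Find $\Delta\mathbf{X}$ which minimizes $Q = \nabla f^T \Delta\mathbf{X} + \tfrac{1}{2}\Delta\mathbf{X}^T[\nabla^2 L]\Delta\mathbf{X}$

subject to

$$g_j + \nabla g_j^T \Delta\mathbf{X} \le 0, \quad j = 1, 2, \ldots, m$$

$$h_k + \nabla h_k^T \Delta\mathbf{X} = 0, \quad k = 1, 2, \ldots, p \tag{7.136}$$

with the Lagrange function given by

$$L = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j g_j(\mathbf{X}) + \sum_{k=1}^{p} \lambda_{m+k} h_k(\mathbf{X}) \tag{7.137}$$

Since the minimum of the augmented Lagrange function is involved, the sequential
quadratic programming method is also known as the projected Lagrangian method.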
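One Newton step of the form of Eq. (7.131) can be sketched as the solution of a small linear system. The example problem below (minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 2 = 0$) is a hypothetical illustration, not taken from the text, and the helper names `solve` and `sqp_newton_step` are assumptions. Because the objective is quadratic and the constraint is linear, a single step reaches the solution.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def sqp_newton_step(hess_L, grad_f, h_val, H):
    """Solve [[hess_L, H], [H^T, 0]] {dX, lam_new} = -{grad_f, h}; H is n x p."""
    n, p = len(grad_f), len(h_val)
    A = [hess_L[i][:] + H[i][:] for i in range(n)]
    A += [[H[i][k] for i in range(n)] + [0.0] * p for k in range(p)]
    rhs = [-g for g in grad_f] + [-h for h in h_val]
    sol = solve(A, rhs)
    return sol[:n], sol[n:]


x = [0.0, 0.0]
dx, lam_new = sqp_newton_step(
    hess_L=[[2.0, 0.0], [0.0, 2.0]],      # Hessian of the Lagrangian
    grad_f=[2.0 * x[0], 2.0 * x[1]],      # gradient of f at x
    h_val=[x[0] + x[1] - 2.0],            # constraint value at x
    H=[[1.0], [1.0]],                     # column k = gradient of h_k
)
x_new = [xi + di for xi, di in zip(x, dx)]
```

For a general nonlinear problem the step would be repeated, re-evaluating the Hessian, gradients, and constraint values at each new point.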


7.10.2 Solution Procedure 

As in the case of Newton's method of unconstrained minimization, the solution vector
$\Delta\mathbf{X}$ in Eq. (7.136) is treated as the search direction, $\mathbf{S}$, and the
quadratic programming subproblem (in terms of the design vector $\mathbf{S}$) is restated as:

Find $\mathbf{S}$ which minimizes $Q(\mathbf{S}) = \nabla f(\mathbf{X})^T\mathbf{S} + \tfrac{1}{2}\mathbf{S}^T[H]\mathbf{S}$

subject to

$$\beta_j g_j(\mathbf{X}) + \nabla g_j(\mathbf{X})^T\mathbf{S} \le 0, \quad j = 1, 2, \ldots, m$$

$$\bar{\beta} h_k(\mathbf{X}) + \nabla h_k(\mathbf{X})^T\mathbf{S} = 0, \quad k = 1, 2, \ldots, p \tag{7.138}$$

where $[H]$ is a positive definite matrix that is taken initially as the identity matrix
and is updated in subsequent iterations so as to converge to the Hessian matrix of the
Lagrange function of Eq. (7.137), and $\beta_j$ and $\bar{\beta}$ are constants used to ensure
that the linearized constraints do not cut off the feasible space completely. Typical values
of these constants are given by

$$\bar{\beta} \approx 0.9; \qquad \beta_j = \begin{cases} 1 & \text{if } g_j(\mathbf{X}) \le 0 \\ \bar{\beta} & \text{if } g_j(\mathbf{X}) \ge 0 \end{cases} \tag{7.139}$$


The subproblem of Eq. (7.138) is a quadratic programming problem and hence the
method described in Section 4.8 can be used for its solution. Alternatively, the problem
can be solved by any of the methods described in this chapter since the gradients of the
functions involved can be evaluated easily. Since the Lagrange multipliers associated
with the solution of the problem, Eq. (7.138), are needed, they can be evaluated using
Eq. (7.263). Once the search direction, $\mathbf{S}$, is found by solving the problem in
Eq. (7.138), the design vector is updated as

$$\mathbf{X}_{j+1} = \mathbf{X}_j + \alpha^*\mathbf{S} \tag{7.140}$$

where $\alpha^*$ is the optimal step length along the direction $\mathbf{S}$ found by minimizing
the function (using an exterior penalty function approach):

$$\phi = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j\left(\max[0, g_j(\mathbf{X})]\right) + \sum_{k=1}^{p} \lambda_{m+k}\,|h_k(\mathbf{X})| \tag{7.141}$$

with

$$\lambda_j = \begin{cases} |\lambda_j|, \quad j = 1, 2, \ldots, m + p & \text{in first iteration} \\ \max\left\{|\lambda_j|, \tfrac{1}{2}(\tilde{\lambda}_j + |\lambda_j|)\right\} & \text{in subsequent iterations} \end{cases} \tag{7.142}$$

and $\tilde{\lambda}_j = \lambda_j$ of the previous iteration. The one-dimensional step length
$\alpha^*$ can be found by any of the methods discussed in Chapter 5.

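Once $\mathbf{X}_{j+1}$ is found from Eq. (7.140), for the next iteration the Hessian matrix
$[H]$ is updated to improve the quadratic approximation in Eq. (7.138). Usually, a modified
BFGS formula, given below, is used for this purpose [7.12]:

$$[H_{i+1}] = [H_i] - \frac{[H_i]\mathbf{P}_i\mathbf{P}_i^T[H_i]}{\mathbf{P}_i^T[H_i]\mathbf{P}_i} + \frac{\boldsymbol{\gamma}\boldsymbol{\gamma}^T}{\mathbf{P}_i^T\boldsymbol{\gamma}} \tag{7.143}$$

$$\mathbf{P}_i = \mathbf{X}_{i+1} - \mathbf{X}_i \tag{7.144}$$

$$\boldsymbol{\gamma} = \theta\mathbf{Q}_i + (1 - \theta)[H_i]\mathbf{P}_i \tag{7.145}$$

$$\mathbf{Q}_i = \nabla_x L(\mathbf{X}_{i+1}, \boldsymbol{\lambda}_{i+1}) - \nabla_x L(\mathbf{X}_i, \boldsymbol{\lambda}_i) \tag{7.146}$$

$$\theta = \begin{cases} 1.0 & \text{if } \mathbf{P}_i^T\mathbf{Q}_i \ge 0.2\,\mathbf{P}_i^T[H_i]\mathbf{P}_i \\[1ex] \dfrac{0.8\,\mathbf{P}_i^T[H_i]\mathbf{P}_i}{\mathbf{P}_i^T[H_i]\mathbf{P}_i - \mathbf{P}_i^T\mathbf{Q}_i} & \text{if } \mathbf{P}_i^T\mathbf{Q}_i < 0.2\,\mathbf{P}_i^T[H_i]\mathbf{P}_i \end{cases} \tag{7.147}$$

where $L$ is given by Eq. (7.137) and the constants 0.2 and 0.8 in Eq. (7.147) can be
changed, based on numerical experience.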
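The damping rule of Eq. (7.147) and the update of Eqs. (7.143) to (7.145) can be sketched as follows. Two assumptions are made here and should be checked against the original reference: the rank-one correction is taken in the usual Powell-damped form $\boldsymbol{\gamma}\boldsymbol{\gamma}^T / (\mathbf{P}^T\boldsymbol{\gamma})$, and $[H]$ is assumed symmetric so that $[H]\mathbf{P}\mathbf{P}^T[H]$ is the outer product of $[H]\mathbf{P}$ with itself. The input vectors are hypothetical.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def damped_bfgs_update(H, P, Q):
    """Return (H_new, theta, gamma) for a symmetric positive definite H."""
    HP = mat_vec(H, P)
    pHp = dot(P, HP)
    pq = dot(P, Q)
    # Damping factor: keeps P^T gamma >= 0.2 P^T H P so H_new stays positive definite.
    theta = 1.0 if pq >= 0.2 * pHp else 0.8 * pHp / (pHp - pq)
    gamma = [theta * q + (1.0 - theta) * hp for q, hp in zip(Q, HP)]
    p_gamma = dot(P, gamma)
    n = len(P)
    H_new = [[H[i][j] - HP[i] * HP[j] / pHp + gamma[i] * gamma[j] / p_gamma
              for j in range(n)] for i in range(n)]
    return H_new, theta, gamma


identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical step and gradient change with P^T Q < 0.2 P^T H P (damping active).
H_new, theta, gamma = damped_bfgs_update(identity, [1.0, 0.0], [0.1, 0.0])
```

With this form the updated matrix satisfies $[H_{i+1}]\mathbf{P}_i = \boldsymbol{\gamma}$, which is the secant condition applied to the damped vector.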


Example 7.5 Find the solution of the problem (see Problem 1.31):

$$\text{Minimize } f(\mathbf{X}) = 0.1x_1 + 0.05773x_2 \tag{E$_1$}$$

subject to

$$g_1(\mathbf{X}) = \frac{0.6}{x_1} + \frac{0.3464}{x_2} - 0.1 \le 0 \tag{E$_2$}$$

$$g_2(\mathbf{X}) = 6 - x_1 \le 0 \tag{E$_3$}$$

$$g_3(\mathbf{X}) = 7 - x_2 \le 0 \tag{E$_4$}$$

using the sequential quadratic programming technique.

SOLUTION Let the starting point be $\mathbf{X}_1 = (11.8765, 7.0)^T$ with
$g_1(\mathbf{X}_1) = g_3(\mathbf{X}_1) = 0$, $g_2(\mathbf{X}_1) = -5.8765$, and
$f(\mathbf{X}_1) = 1.5917$. The gradients of the objective and constraint functions at
$\mathbf{X}_1$ are given by


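$$\nabla f(\mathbf{X}_1) = \begin{Bmatrix} 0.1 \\ 0.05773 \end{Bmatrix}, \qquad \nabla g_1(\mathbf{X}_1) = \begin{Bmatrix} -\dfrac{0.6}{x_1^2} \\[1.5ex] -\dfrac{0.3464}{x_2^2} \end{Bmatrix}_{\mathbf{X}_1} = \begin{Bmatrix} -0.004254 \\ -0.007069 \end{Bmatrix}$$

$$\nabla g_2(\mathbf{X}_1) = \begin{Bmatrix} -1 \\ 0 \end{Bmatrix}, \qquad \nabla g_3(\mathbf{X}_1) = \begin{Bmatrix} 0 \\ -1 \end{Bmatrix}$$

We assume the matrix $[H_1]$ to be the identity matrix and hence the objective function
of Eq. (7.138) becomes

$$Q(\mathbf{S}) = 0.1s_1 + 0.05773s_2 + 0.5s_1^2 + 0.5s_2^2 \tag{E$_5$}$$

Equation (7.139) gives $\beta_1 = \beta_3 = 0$ since $g_1 = g_3 = 0$ and $\beta_2 = 1.0$ since
$g_2 < 0$, and hence the constraints of Eq. (7.138) can be expressed as

$$\tilde{g}_1 = -0.004254s_1 - 0.007069s_2 \le 0 \tag{E$_6$}$$

$$\tilde{g}_2 = -5.8765 - s_1 \le 0 \tag{E$_7$}$$

$$\tilde{g}_3 = -s_2 \le 0 \tag{E$_8$}$$

We solve this quadratic programming problem [Eqs. (E$_5$) to (E$_8$)] directly with the use
of the Kuhn-Tucker conditions. The Kuhn-Tucker conditions are given by

$$\frac{\partial Q}{\partial s_1} + \sum_{j=1}^{3} \lambda_j \frac{\partial \tilde{g}_j}{\partial s_1} = 0 \tag{E$_9$}$$

$$\frac{\partial Q}{\partial s_2} + \sum_{j=1}^{3} \lambda_j \frac{\partial \tilde{g}_j}{\partial s_2} = 0 \tag{E$_{10}$}$$

$$\lambda_j \tilde{g}_j = 0, \quad j = 1, 2, 3 \tag{E$_{11}$}$$

$$\tilde{g}_j \le 0, \quad j = 1, 2, 3 \tag{E$_{12}$}$$

$$\lambda_j \ge 0, \quad j = 1, 2, 3 \tag{E$_{13}$}$$

Equations (E$_9$) and (E$_{10}$) can be expressed, in this case, as

$$0.1 + s_1 - 0.004254\lambda_1 - \lambda_2 = 0 \tag{E$_{14}$}$$

$$0.05773 + s_2 - 0.007069\lambda_1 - \lambda_3 = 0 \tag{E$_{15}$}$$

By considering all possibilities of active constraints, we find that the optimum solution
of the quadratic programming problem [Eqs. (E$_5$) to (E$_8$)] is given by

$$s_1^* = -0.04791, \quad s_2^* = 0.02883, \quad \lambda_1^* = 12.2450, \quad \lambda_2^* = 0, \quad \lambda_3^* = 0$$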
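The reported solution can be reproduced by examining the one active-set case in which only the first linearized constraint is active ($\lambda_2 = \lambda_3 = 0$). The sketch below solves that case in closed form and then checks the remaining Kuhn-Tucker conditions. The variable names are illustrative; the other active-set combinations that the text says were considered are not enumerated here.

```python
c = [0.1, 0.05773]            # linear term of Q(S): gradient of f at X1
a1 = [-0.004254, -0.007069]   # gradient of the first linearized constraint

# Stationarity with [H1] = I and lambda_2 = lambda_3 = 0:  s = -c - lambda_1 * a1.
# Substituting into the active constraint a1 . s = 0 gives lambda_1 directly.
a1_dot_c = sum(ai * ci for ai, ci in zip(a1, c))
a1_dot_a1 = sum(ai * ai for ai in a1)
lam1 = -a1_dot_c / a1_dot_a1
s = [-ci - lam1 * ai for ci, ai in zip(c, a1)]

# Remaining conditions: the inactive constraints must hold and lambda_1 >= 0.
g2_lin = -5.8765 - s[0]
g3_lin = -s[1]
is_kkt_point = lam1 >= 0 and g2_lin <= 0 and g3_lin <= 0
```

Since the quadratic objective is strictly convex, a point satisfying all the Kuhn-Tucker conditions is the unique minimizer of the subproblem.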

The new design vector, $\mathbf{X}$, can be expressed as

$$\mathbf{X} = \mathbf{X}_1 + \alpha\mathbf{S} = \begin{Bmatrix} 11.8765 - 0.04791\alpha \\ 7.0 + 0.02883\alpha \end{Bmatrix}$$

where $\alpha$ can be found by minimizing the function $\phi$ in Eq. (7.141):

$$\phi = 0.1(11.8765 - 0.04791\alpha) + 0.05773(7.0 + 0.02883\alpha) + 12.2450\left(\frac{0.6}{11.8765 - 0.04791\alpha} + \frac{0.3464}{7.0 + 0.02883\alpha} - 0.1\right)$$

By using the quadratic interpolation technique (the unrestricted search method can also be
used for simplicity), we find that $\phi$ attains its minimum value of 1.48 at
$\alpha^* = 64.93$, which corresponds to the new design vector

$$\mathbf{X}_2 = \begin{Bmatrix} 8.7657 \\ 8.8719 \end{Bmatrix}$$


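with $f(\mathbf{X}_2) = 1.38874$ and $g_1(\mathbf{X}_2) = +0.0074932$ (violated slightly). Next
we update the matrix $[H]$ using Eq. (7.143) with

$$L = 0.1x_1 + 0.05773x_2 + 12.2450\left(\frac{0.6}{x_1} + \frac{0.3464}{x_2} - 0.1\right)$$

$$\nabla_x L = \begin{Bmatrix} \dfrac{\partial L}{\partial x_1} \\[1.5ex] \dfrac{\partial L}{\partial x_2} \end{Bmatrix} \quad \text{with} \quad \frac{\partial L}{\partial x_1} = 0.1 - \frac{7.3470}{x_1^2} \quad \text{and} \quad \frac{\partial L}{\partial x_2} = 0.05773 - \frac{4.2417}{x_2^2}$$

$$\mathbf{P}_1 = \mathbf{X}_2 - \mathbf{X}_1 = \begin{Bmatrix} -3.1108 \\ 1.8719 \end{Bmatrix}$$

$$\mathbf{Q}_1 = \nabla_x L(\mathbf{X}_2) - \nabla_x L(\mathbf{X}_1) = \begin{Bmatrix} 0.00438 \\ 0.00384 \end{Bmatrix} - \begin{Bmatrix} 0.04791 \\ -0.02883 \end{Bmatrix} = \begin{Bmatrix} -0.04353 \\ 0.03267 \end{Bmatrix}$$

$$\mathbf{P}_1^T[H_1]\mathbf{P}_1 = 13.1811, \qquad \mathbf{P}_1^T\mathbf{Q}_1 = 0.19656$$

This indicates that $\mathbf{P}_1^T\mathbf{Q}_1 < 0.2\,\mathbf{P}_1^T[H_1]\mathbf{P}_1$, and hence
$\theta$ is computed using Eq. (7.147) as

$$\theta = \frac{(0.8)(13.1811)}{13.1811 - 0.19656} = 0.81211$$

$$\boldsymbol{\gamma} = \theta\mathbf{Q}_1 + (1 - \theta)[H_1]\mathbf{P}_1 = \begin{Bmatrix} 0.54914 \\ -0.32518 \end{Bmatrix}$$

Hence

$$[H_2] = \begin{bmatrix} 0.2887 & 0.4283 \\ 0.4283 & 0.7422 \end{bmatrix}$$

We can now start another iteration by defining a new quadratic programming problem
using Eq. (7.138) and continue the procedure until the optimum solution is found.
Note that the objective function reduced from a value of 1.5917 to 1.38874 in one
iteration when $\mathbf{X}$ changed from $\mathbf{X}_1$ to $\mathbf{X}_2$.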


Indirect Methods 

7.11 TRANSFORMATION TECHNIQUES 

If the constraints $g_j(\mathbf{X})$ are explicit functions of the variables $x_i$ and have
certain simple forms, it may be possible to make a transformation of the independent
variables such that the constraints are satisfied automatically [7.13]. Thus it may be
possible to convert a constrained optimization problem into an unconstrained one by making
a change of variables. Some typical transformations are indicated below:

1. If lower and upper bounds on $x_i$ are specified as

$$l_i \le x_i \le u_i \tag{7.148}$$

these can be satisfied by transforming the variable $x_i$ as

$$x_i = l_i + (u_i - l_i)\sin^2 y_i \tag{7.149}$$

where $y_i$ is the new variable, which can take any value.

2. If a variable $x_i$ is restricted to lie in the interval (0, 1), we can use the
transformation:

$$x_i = \sin^2 y_i, \quad x_i = \cos^2 y_i, \quad x_i = \frac{e^{y_i}}{e^{y_i} + e^{-y_i}}, \quad \text{or} \quad x_i = \frac{y_i^2}{1 + y_i^2} \tag{7.150}$$

3. If the variable $x_i$ is constrained to take only positive values, the transformation
can be

$$x_i = \text{abs}(y_i), \quad x_i = y_i^2, \quad \text{or} \quad x_i = e^{y_i} \tag{7.151}$$

4. If the variable is restricted to take values lying only in between $-1$ and 1, the
transformation can be

$$x_i = \sin y_i, \quad x_i = \cos y_i, \quad \text{or} \quad x_i = \frac{2y_i}{1 + y_i^2} \tag{7.152}$$


Note the following aspects of transformation techniques:

1. The constraints $g_j(\mathbf{X})$ have to be very simple functions of $x_i$.

2. For certain constraints it may not be possible to find the necessary transformation.

3. If it is not possible to eliminate all the constraints by making a change of
variables, it may be better not to use the transformation at all. The partial
transformation may sometimes produce a distorted objective function which
might be more difficult to minimize than the original function.
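The bound transformation of Eq. (7.149) can be sketched on a one-variable problem. Everything specific here is a hypothetical illustration: the objective $(x - 5)^2$, the bounds $0 \le x \le 3$, the starting value, and the fixed-step gradient descent used as the unconstrained minimizer. The point of the sketch is only that the iterate in $y$ is never restricted, while the recovered $x$ always satisfies the bounds.

```python
import math


def to_bounded(y, lower, upper):
    """Eq. (7.149): map an unrestricted y to x with lower <= x <= upper."""
    return lower + (upper - lower) * math.sin(y) ** 2


def d_objective_dy(y, lower, upper):
    """Chain rule for f(x) = (x - 5)^2 with x = to_bounded(y)."""
    x = to_bounded(y, lower, upper)
    dx_dy = (upper - lower) * math.sin(2.0 * y)  # derivative of sin^2(y) is sin(2y)
    return 2.0 * (x - 5.0) * dx_dy


lower, upper = 0.0, 3.0
y = 0.5                       # arbitrary unrestricted starting value
for _ in range(500):          # fixed-step gradient descent in the y variable
    y -= 0.01 * d_objective_dy(y, lower, upper)
x_opt = to_bounded(y, lower, upper)
```

The unconstrained minimum of $(x - 5)^2$ lies outside the bounds, so the transformed search settles at the upper bound, where $\sin^2 y = 1$.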

To illustrate the method of transformation of variables, we consider the following 
problem. 


Example 7.6 Find the dimensions of a rectangular prism-type box that has the largest
volume when the sum of its length, width, and height is limited to a maximum value
of 60 in. and its length is restricted to a maximum value of 36 in.

SOLUTION Let $x_1$, $x_2$, and $x_3$ denote the length, width, and height of the box,
respectively. The problem can be stated as follows:

$$\text{Maximize } f(x_1, x_2, x_3) = x_1x_2x_3 \tag{E$_1$}$$

subject to

$$x_1 + x_2 + x_3 \le 60 \tag{E$_2$}$$

$$x_1 \le 36 \tag{E$_3$}$$

$$x_i \ge 0, \quad i = 1, 2, 3 \tag{E$_4$}$$

By introducing new variables as

$$y_1 = x_1, \quad y_2 = x_2, \quad y_3 = x_1 + x_2 + x_3 \tag{E$_5$}$$

or

$$x_1 = y_1, \quad x_2 = y_2, \quad x_3 = y_3 - y_1 - y_2 \tag{E$_6$}$$

the constraints of Eqs. (E$_2$) to (E$_4$) can be restated as

$$0 \le y_1 \le 36, \quad 0 \le y_2 \le 60, \quad 0 \le y_3 \le 60 \tag{E$_7$}$$

where the upper bound, for example, on $y_2$ is obtained by setting $x_1 = x_3 = 0$ in
Eq. (E$_2$). The constraints of Eq. (E$_7$) will be satisfied automatically if we define new
variables $z_i$, $i = 1, 2, 3$, as

$$y_1 = 36\sin^2 z_1, \quad y_2 = 60\sin^2 z_2, \quad y_3 = 60\sin^2 z_3 \tag{E$_8$}$$

Thus the problem can be stated as an unconstrained problem as follows:

$$\text{Maximize } f(z_1, z_2, z_3) = y_1y_2(y_3 - y_1 - y_2) = 2160\sin^2 z_1\sin^2 z_2\,(60\sin^2 z_3 - 36\sin^2 z_1 - 60\sin^2 z_2) \tag{E$_9$}$$

The necessary conditions of optimality yield the relations

$$\frac{\partial f}{\partial z_1} = 259{,}200\,\sin z_1\cos z_1\sin^2 z_2\left(\sin^2 z_3 - \tfrac{6}{5}\sin^2 z_1 - \sin^2 z_2\right) = 0 \tag{E$_{10}$}$$

$$\frac{\partial f}{\partial z_2} = 518{,}400\,\sin^2 z_1\sin z_2\cos z_2\left(\tfrac{1}{2}\sin^2 z_3 - \tfrac{3}{10}\sin^2 z_1 - \sin^2 z_2\right) = 0 \tag{E$_{11}$}$$

$$\frac{\partial f}{\partial z_3} = 259{,}200\,\sin^2 z_1\sin^2 z_2\sin z_3\cos z_3 = 0 \tag{E$_{12}$}$$

Equation (E$_{12}$) gives the nontrivial solution as $\cos z_3 = 0$ or $\sin^2 z_3 = 1$. Hence
Eqs. (E$_{10}$) and (E$_{11}$) yield $\sin^2 z_1 = \tfrac{5}{9}$ and $\sin^2 z_2 = \tfrac{1}{3}$.
Thus the optimum solution is given by $x_1^* = 20$ in., $x_2^* = 20$ in., $x_3^* = 20$ in., and
the maximum volume = 8000 in$^3$.
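The stationary point can be checked numerically. The sketch below evaluates the transformed objective of Eq. (E$_9$) at $\sin^2 z_1 = 5/9$, $\sin^2 z_2 = 1/3$, $\sin^2 z_3 = 1$ and estimates the gradient by central differences; the step size `h` and the helper names are arbitrary illustrative choices.

```python
import math


def volume(z1, z2, z3):
    """Transformed objective of Eq. (E9): f = y1 * y2 * (y3 - y1 - y2)."""
    y1 = 36.0 * math.sin(z1) ** 2
    y2 = 60.0 * math.sin(z2) ** 2
    y3 = 60.0 * math.sin(z3) ** 2
    return y1 * y2 * (y3 - y1 - y2)


def gradient(func, z, h=1e-6):
    """Central-difference estimate of the gradient of func at z."""
    grads = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        grads.append((func(*zp) - func(*zm)) / (2.0 * h))
    return grads


z_opt = (math.asin(math.sqrt(5.0 / 9.0)),
         math.asin(math.sqrt(1.0 / 3.0)),
         math.pi / 2.0)
v_opt = volume(*z_opt)          # volume at the stationary point
g_opt = gradient(volume, z_opt)  # should be close to zero in every component
```

A vanishing gradient confirms stationarity only; that the point is a maximum follows from the argument in the text.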


7.12 BASIC APPROACH OF THE PENALTY FUNCTION METHOD 

Penalty function methods transform the basic optimization problem into alternative
formulations such that numerical solutions are sought by solving a sequence of
unconstrained minimization problems. Let the basic optimization problem, with
inequality constraints, be of the form:

Find $\mathbf{X}$ which minimizes $f(\mathbf{X})$

subject to

$$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{7.153}$$

This problem is converted into an unconstrained minimization problem by constructing
a function of the form

$$\phi_k = \phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} G_j[g_j(\mathbf{X})] \tag{7.154}$$

where $G_j$ is some function of the constraint $g_j$, and $r_k$ is a positive constant known
as the penalty parameter. The significance of the second term on the right side of
Eq. (7.154), called the penalty term, will be seen in Sections 7.13 and 7.15. If the
unconstrained minimization of the $\phi$ function is repeated for a sequence of values of
the penalty parameter $r_k$ ($k = 1, 2, \ldots$), the solution may be brought to converge to
that of the original problem stated in Eq. (7.153). This is the reason why the penalty
function methods are also known as sequential unconstrained minimization techniques
(SUMTs).

The penalty function formulations for inequality constrained problems can be
divided into two categories: interior and exterior methods. In the interior formulations,
some popularly used forms of $G_j$ are given by

$$G_j = -\frac{1}{g_j(\mathbf{X})} \tag{7.155}$$

$$G_j = \log[-g_j(\mathbf{X})] \tag{7.156}$$

Some commonly used forms of the function $G_j$ in the case of exterior penalty function
formulations are

$$G_j = \max[0, g_j(\mathbf{X})] \tag{7.157}$$

$$G_j = \{\max[0, g_j(\mathbf{X})]\}^2 \tag{7.158}$$

In the interior methods, the unconstrained minima of $\phi_k$ all lie in the feasible region
and converge to the solution of Eq. (7.153) as $r_k$ is varied in a particular manner. In
the exterior methods, the unconstrained minima of $\phi_k$ all lie in the infeasible region
and converge to the desired solution from the outside as $r_k$ is changed in a specified
manner. The convergence of the unconstrained minima of $\phi_k$ is illustrated in Fig. 7.10
for the simple problem

Find $\mathbf{X} = \{x_1\}$ which minimizes $f(\mathbf{X}) = \alpha x_1$

subject to

$$g_1(\mathbf{X}) = -x_1 \le 0 \tag{7.159}$$

Figure 7.10 Penalty function methods: (a) exterior method; (b) interior method.

It can be seen from Fig. 7.10a that the unconstrained minima of $\phi(\mathbf{X}, r_k)$ converge
to the optimum point $\mathbf{X}^*$ as the parameter $r_k$ is increased sequentially. On the
other hand, the interior method shown in Fig. 7.10b gives convergence as the parameter $r_k$
is decreased sequentially.

There are several reasons for the appeal of the penalty function formulations. One
main reason, which can be observed from Fig. 7.10, is that the sequential nature of
the method allows a gradual or sequential approach to criticality of the constraints. In
addition, the sequential process permits a graded approximation to be used in analysis
of the system. This means that if the evaluation of $f$ and $g_j$ [and hence
$\phi(\mathbf{X}, r_k)$] for any specified design vector $\mathbf{X}$ is computationally very
difficult, we can use coarse approximations during the early stages of optimization (when
the unconstrained minima of $\phi_k$ are far away from the optimum) and finer or more
detailed analysis approximation during the final stages of optimization. Another reason is
that the algorithms for the unconstrained minimization of rather arbitrary functions are
well studied and generally are quite reliable. The algorithms of the interior and the
exterior penalty function methods are given in Sections 7.13 and 7.15.
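The opposite trends of the two formulations can be sketched for a problem of the form of Eq. (7.159), with objective $\alpha x_1$ and constraint $-x_1 \le 0$. The value $\alpha = 1$, the penalty forms chosen (Eq. (7.158) for the exterior case and Eq. (7.155) for the interior case), the search intervals, and the golden-section minimizer are all illustrative assumptions.

```python
import math

ALPHA = 1.0  # assumed value of alpha in f(X) = alpha * x1


def phi_exterior(x, r):
    """f + r * {max[0, g]}^2 with g = -x (Eq. 7.158)."""
    return ALPHA * x + r * max(0.0, -x) ** 2


def phi_interior(x, r):
    """f + r * (-1/g) with g = -x, defined only for x > 0 (Eq. 7.155)."""
    g = -x
    return ALPHA * x + r * (-1.0 / g)


def golden_min(func, lo, hi, tol=1e-9):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b, d = d, c
            c = b - ratio * (b - a)
        else:
            a, c = c, d
            d = a + ratio * (b - a)
    return 0.5 * (a + b)


# Exterior minima approach x = 0 from the infeasible side as r increases.
exterior = {r: golden_min(lambda x: phi_exterior(x, r), -10.0, 10.0)
            for r in (1.0, 10.0, 100.0)}
# Interior minima approach x = 0 from the feasible side as r decreases.
interior = {r: golden_min(lambda x: phi_interior(x, r), 1e-9, 10.0)
            for r in (1.0, 0.1, 0.01)}
```

For this problem the minima are also available in closed form, $-\alpha/(2r)$ for the exterior function and $\sqrt{r/\alpha}$ for the interior function, which makes the direction of approach explicit.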


7.13 INTERIOR PENALTY FUNCTION METHOD 

As indicated in Section 7.12, in the interior penalty function methods, a new function
($\phi$ function) is constructed by augmenting a penalty term to the objective function. The
penalty term is chosen such that its value will be small at points away from the constraint
boundaries and will tend to infinity as the constraint boundaries are approached.
Hence the value of the $\phi$ function also "blows up" as the constraint boundaries are
approached. This behavior can also be seen from Fig. 7.10b. Thus once the unconstrained
minimization of $\phi(\mathbf{X}, r_k)$ is started from any feasible point $\mathbf{X}_1$, the
subsequent points generated will always lie within the feasible domain since the constraint
boundaries act as barriers during the minimization process. This is why the interior penalty
function methods are also known as barrier methods. The $\phi$ function defined originally
by Carroll [7.14] is

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X})} \tag{7.160}$$

It can be seen that the value of the function $\phi$ will always be greater than $f$ since
$g_j(\mathbf{X})$ is negative for all feasible points $\mathbf{X}$. If any constraint
$g_j(\mathbf{X})$ is satisfied critically (with equality sign), the value of $\phi$ tends to
infinity. It is to be noted that the penalty term in Eq. (7.160) is not defined if
$\mathbf{X}$ is infeasible. This is a serious shortcoming of Eq. (7.160). Since this
equation does not allow any constraint to be violated, it requires a feasible starting point
for the search toward the optimum point. However, in many engineering problems, it may not
be very difficult to find a point satisfying all the constraints, $g_j(\mathbf{X}) \le 0$, at
the expense of large values of the objective function, $f(\mathbf{X})$. If there is any
difficulty in finding a feasible starting point, the method described in the latter part of
this section can be used to find a feasible point. Since the initial point as well as each
of the subsequent points generated in this method lies inside the acceptable region of the
design space, the method is classified as an interior penalty function formulation. The
iteration procedure of this method can be summarized as follows.

Iterative Process

1. Start with an initial feasible point $\mathbf{X}_1$ satisfying all the constraints with
strict inequality sign, that is, $g_j(\mathbf{X}_1) < 0$ for $j = 1, 2, \ldots, m$, and an
initial value of $r_1 > 0$. Set $k = 1$.

2. Minimize $\phi(\mathbf{X}, r_k)$ by using any of the unconstrained minimization methods
and obtain the solution $\mathbf{X}_k^*$.

3. Test whether $\mathbf{X}_k^*$ is the optimum solution of the original problem. If
$\mathbf{X}_k^*$ is found to be optimum, terminate the process. Otherwise, go to the next step.

4. Find the value of the next penalty parameter, $r_{k+1}$, as

$$r_{k+1} = c\,r_k$$

where $c < 1$.

5. Set the new value of $k = k + 1$, take the new starting point as
$\mathbf{X}_1 = \mathbf{X}_k^*$, and go to step 2.

Although the algorithm is straightforward, there are a number of points to be considered
in implementing the method:

1. The starting feasible point $\mathbf{X}_1$ may not be readily available in some cases.

2. A suitable value of the initial penalty parameter ($r_1$) has to be found.

3. A proper value has to be selected for the multiplication factor, $c$.

4. Suitable convergence criteria have to be chosen to identify the optimum point.

5. The constraints have to be normalized so that each one of them varies between
$-1$ and 0 only.

All these aspects are discussed in the following paragraphs.
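The iterative process above can be sketched on a small one-variable problem. All specifics are hypothetical: the objective $x^2$, the constraints $1 - x \le 0$ and $x - 4 \le 0$, the values of $r_1$ and $c$, and the convergence test based on the change between successive minima. A bracketing golden-section search over the feasible interval stands in for a general unconstrained minimizer, which is why the starting point is used only in the convergence test.

```python
import math


def f(x):
    return x ** 2


constraints = [lambda x: 1.0 - x, lambda x: x - 4.0]  # each g_j(x) <= 0


def phi(x, r):
    """Carroll's function, Eq. (7.160): f - r * sum(1 / g_j)."""
    return f(x) - r * sum(1.0 / g(x) for g in constraints)


def golden_min(func, lo, hi, tol=1e-12):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b, d = d, c
            c = b - ratio * (b - a)
        else:
            a, c = c, d
            d = a + ratio * (b - a)
    return 0.5 * (a + b)


def interior_penalty(x_start, r=1.0, c=0.1, eps=1e-4, max_outer=30):
    """Steps 1 to 5: minimize phi for a decreasing sequence r_{k+1} = c * r_k."""
    assert all(g(x_start) < 0 for g in constraints), "start must be strictly feasible"
    x_prev = x_start
    for _ in range(max_outer):
        x_new = golden_min(lambda x: phi(x, r), 1.0 + 1e-12, 4.0 - 1e-12)
        if abs(x_new - x_prev) < eps:   # assumed convergence test (step 3)
            return x_new, r
        x_prev, r = x_new, c * r        # steps 4 and 5
    return x_prev, r


x_opt, r_final = interior_penalty(x_start=2.5)
```

Every iterate stays strictly inside the feasible interval, and the sequence approaches the constrained optimum at $x = 1$ from the feasible side as $r_k$ shrinks.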

Starting Feasible Point $\mathbf{X}_1$. In most engineering problems, it will not be very
difficult to find an initial point $\mathbf{X}_1$ satisfying all the constraints,
$g_j(\mathbf{X}_1) < 0$. As an example, consider the problem of minimum weight design of a
beam whose deflection under a given loading condition has to remain less than or equal to a
specified value. In this case one can always choose the cross section of the beam to be very
large initially so that the constraint remains satisfied. The only problem is that the
weight of the beam (objective) corresponding to this initial design will be very large. Thus
in most of the practical problems, we will be able to find a feasible starting point at the
expense of a large value of the objective function. However, there may be some situations
where the feasible design points cannot be found so easily. In such cases, the required
feasible starting points can be found by using the interior penalty function method itself
as follows:

1. Choose an arbitrary point $\mathbf{X}_1$ and evaluate the constraints $g_j(\mathbf{X})$ at the point
$\mathbf{X}_1$. Since the point $\mathbf{X}_1$ is arbitrary, it may not satisfy all the constraints with
strict inequality sign. If $r$ out of a total of $m$ constraints are violated, renumber
the constraints such that the last $r$ constraints will become the violated ones,
that is,

$$g_j(\mathbf{X}_1) < 0, \quad j = 1, 2, \ldots, m - r$$
$$g_j(\mathbf{X}_1) \ge 0, \quad j = m - r + 1, m - r + 2, \ldots, m \tag{7.161}$$

2. Identify the constraint that is violated most at the point $\mathbf{X}_1$, that is, find the
integer $k$ such that

$$g_k(\mathbf{X}_1) = \max[g_j(\mathbf{X}_1)] \quad \text{for } j = m - r + 1, m - r + 2, \ldots, m \tag{7.162}$$

3. Now formulate a new optimization problem as

Find $\mathbf{X}$ which minimizes $g_k(\mathbf{X})$

subject to

$$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m - r$$
$$g_j(\mathbf{X}) - g_k(\mathbf{X}_1) \le 0, \quad j = m - r + 1, m - r + 2, \ldots, k - 1, k + 1, \ldots, m \tag{7.163}$$

4. Solve the optimization problem formulated in step 3 by taking the point $\mathbf{X}_1$ as
a feasible starting point using the interior penalty function method. Note that
this optimization can be terminated whenever the value of the objective
function $g_k(\mathbf{X})$ drops below zero. Thus the solution obtained, $\mathbf{X}_M$, will satisfy at
least one more constraint than did the original point $\mathbf{X}_1$.

5. If not all the constraints are satisfied at the point $\mathbf{X}_M$, set the new starting point
as $\mathbf{X}_1 = \mathbf{X}_M$, and renumber the constraints such that the last $r$ constraints will
be the unsatisfied ones (this value of $r$ will be different from the previous value),
and go to step 2.

This procedure is repeated until all the constraints are satisfied and a point $\mathbf{X}_1 =
\mathbf{X}_M$ is obtained for which $g_j(\mathbf{X}_1) < 0$, $j = 1, 2, \ldots, m$.

If the constraints are consistent, it should be possible to obtain, by applying the
procedure, a point $\mathbf{X}_1$ that satisfies all the constraints. However, there may exist
situations in which the solution of the problem formulated in step 3 gives an unconstrained
or constrained local minimum of $g_k(\mathbf{X})$ that is positive. In such cases one has to start
afresh with a new point $\mathbf{X}_1$ from step 1 onward.
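Steps 1 to 3 of this procedure amount to a small bookkeeping exercise, sketched below. The helper name `phase_one_problem` and the use of Python callables for the constraints are assumptions made for illustration; instead of renumbering the constraints, the sketch simply keeps track of the indices of the violated ones.

```python
def phase_one_problem(constraints, x1):
    """Build the auxiliary problem of Eq. (7.163) at the point x1.

    Returns None if x1 is already strictly feasible; otherwise returns
    (objective, new_constraints, k), where k is the index of the most
    violated constraint and objective is g_k itself.
    """
    values = [g(x1) for g in constraints]
    violated = [j for j, v in enumerate(values) if v >= 0.0]
    if not violated:
        return None
    k = max(violated, key=lambda j: values[j])     # Eq. (7.162)
    gk_at_x1 = values[k]
    new_constraints = []
    for j, g in enumerate(constraints):
        if j == k:
            continue                               # g_k becomes the objective
        if j in violated:
            # shifted constraint g_j(X) - g_k(X1) <= 0
            new_constraints.append(lambda x, g=g, s=gk_at_x1: g(x) - s)
        else:
            new_constraints.append(g)              # already satisfied: keep as is
    return constraints[k], new_constraints, k

# Usage: three constraints, two of them violated at the trial point (0, 0).
cons = [lambda x: x[0] - 10.0, lambda x: 1.0 - x[0], lambda x: 3.0 - x[1]]
objective, aux_cons, k = phase_one_problem(cons, [0.0, 0.0])
print(k, objective([0.0, 0.0]), [g([0.0, 0.0]) for g in aux_cons])
```

The auxiliary problem returned here would then be handed to the interior penalty function method, stopping as soon as the objective drops below zero (step 4).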

Initial Value of the Penalty Parameter ($r_1$). Since the unconstrained minimization
of $\phi(\mathbf{X}, r_k)$ is to be carried out for a decreasing sequence of $r_k$, it might appear that by
choosing a very small value of $r_1$, we can avoid an excessive number of minimizations
of the function $\phi$. But from a computational point of view, it will be easier to minimize
the unconstrained function $\phi(\mathbf{X}, r_k)$ if $r_k$ is large. This can be seen qualitatively from
Fig. 7.10b. As the value of $r_k$ becomes smaller, the value of the function $\phi$ changes
more rapidly in the vicinity of the minimum $\phi_k^*$. Since it is easier to find the minimum of
a function whose graph is smoother, the unconstrained minimization of $\phi$ will be easier
if $r_k$ is large. However, the minimum of $\phi_k$, $\mathbf{X}_k^*$, will be farther away from the desired
minimum $\mathbf{X}^*$ if $r_k$ is large. Thus it requires an excessive number of unconstrained
minimizations of $\phi(\mathbf{X}, r_k)$ (for several values of $r_k$) to reach the point $\mathbf{X}^*$ if $r_1$ is
selected to be very large. Thus a moderate value has to be chosen for the initial
penalty parameter ($r_1$). In practice, a value of $r_1$ that gives the value of $\phi(\mathbf{X}_1, r_1)$
approximately equal to 1.1 to 2.0 times the value of $f(\mathbf{X}_1)$ has been found to be quite
satisfactory in achieving quick convergence of the process. Thus for any initial feasible
starting point $\mathbf{X}_1$, the value of $r_1$ can be taken as

$$r_1 \simeq (0.1 \text{ to } 1.0)\, \frac{f(\mathbf{X}_1)}{-\sum_{j=1}^{m} 1/g_j(\mathbf{X}_1)} \tag{7.164}$$
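As a quick illustration of Eq. (7.164), the following sketch computes $r_1$ from the objective value and the constraint values at the starting point. The function name and the default `factor=0.5` are assumptions; any factor between 0.1 and 1.0 fits the guideline, and the rule presumes $f(\mathbf{X}_1) > 0$.

```python
def initial_penalty_parameter(f_x1, g_values, factor=0.5):
    """r_1 = factor * f(X1) / (-sum(1/g_j(X1))), with factor in [0.1, 1.0]."""
    if any(g >= 0.0 for g in g_values):
        raise ValueError("X1 must satisfy every constraint with strict inequality")
    barrier = -sum(1.0 / g for g in g_values)      # positive for an interior point
    return factor * f_x1 / barrier

# Usage with made-up numbers: f(X1) = 10, g(X1) = (-0.5, -0.25).
g1 = [-0.5, -0.25]
r1 = initial_penalty_parameter(10.0, g1, factor=0.5)
phi1 = 10.0 - r1 * sum(1.0 / g for g in g1)        # phi(X1, r1)
print(r1, phi1 / 10.0)                             # the ratio phi/f equals 1 + factor
```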

Subsequent Values of the Penalty Parameter. Once the initial value of $r_k$ is chosen,
the subsequent values of $r_{k+1}$ have to be chosen such that

$$r_{k+1} < r_k \tag{7.165}$$

For convenience, the values of $r_k$ are chosen according to the relation

$$r_{k+1} = c\,r_k \tag{7.166}$$

where $c < 1$. The value of $c$ can be taken as 0.1, 0.2, or 0.5.

Convergence Criteria. Since the unconstrained minimization of $\phi(\mathbf{X}, r_k)$ has to be
carried out for a decreasing sequence of values $r_k$, it is necessary to use proper
convergence criteria to identify the optimum point and to avoid an unnecessarily large
number of unconstrained minimizations. The process can be terminated whenever the
following conditions are satisfied.

1. The relative difference between the values of the objective function obtained
at the end of any two consecutive unconstrained minimizations falls below a
small number $\varepsilon_1$, that is,

$$\left| \frac{f(\mathbf{X}_k^*) - f(\mathbf{X}_{k-1}^*)}{f(\mathbf{X}_k^*)} \right| \le \varepsilon_1 \tag{7.167}$$

2. The difference between the optimum points $\mathbf{X}_k^*$ and $\mathbf{X}_{k-1}^*$ becomes very small.
This can be judged in several ways. Some of them are given below:

$$|(\Delta \mathbf{X})_i| \le \varepsilon_2 \tag{7.168}$$

where $\Delta \mathbf{X} = \mathbf{X}_k^* - \mathbf{X}_{k-1}^*$ and $(\Delta \mathbf{X})_i$ is the $i$th component of the vector $\Delta \mathbf{X}$.

$$\max_i |(\Delta \mathbf{X})_i| \le \varepsilon_3 \tag{7.169}$$

$$|\Delta \mathbf{X}| = \left[ (\Delta \mathbf{X})_1^2 + (\Delta \mathbf{X})_2^2 + \cdots + (\Delta \mathbf{X})_n^2 \right]^{1/2} \le \varepsilon_4 \tag{7.170}$$

Note that the values of $\varepsilon_1$ to $\varepsilon_4$ have to be chosen depending on the
characteristics of the problem at hand.
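A minimal sketch of these tests is given below. The function name `convergence_measures` and the thresholds used in the usage lines are illustrative assumptions; the numbers fed to it are the rows $k = 12$ and $k = 13$ of Table 7.4.

```python
import math

def convergence_measures(f_prev, f_curr, x_prev, x_curr):
    """Return the quantities compared with eps_1, eps_3 and eps_4."""
    dx = [a - b for a, b in zip(x_curr, x_prev)]
    return {
        "rel_f": abs((f_curr - f_prev) / f_curr),        # Eq. (7.167)
        "max_dx": max(abs(d) for d in dx),               # Eq. (7.169)
        "norm_dx": math.sqrt(sum(d * d for d in dx)),    # Eq. (7.170)
    }

# Usage with the last two minimizations reported in Table 7.4.
m = convergence_measures(
    1.41423, 1.41422,
    [0.9562e-6, 1.41421, 1.41422],
    [0.3248e-6, 1.41421, 1.41421],
)
done = m["rel_f"] <= 1e-4 and m["norm_dx"] <= 1e-4       # eps values are a choice
print(m, done)
```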


Normalization of Constraints. A structural optimization problem, for example, might
have constraints on the deflection ($\delta$) and the stress ($\sigma$) as

$$g_1(\mathbf{X}) = \delta(\mathbf{X}) - \delta_{\max} \le 0 \tag{7.171}$$

$$g_2(\mathbf{X}) = \sigma(\mathbf{X}) - \sigma_{\max} \le 0 \tag{7.172}$$

where the maximum allowable values are given by $\delta_{\max} = 0.5$ in. and $\sigma_{\max} =
20{,}000$ psi. If a design vector $\mathbf{X}_1$ gives the values of $g_1$ and $g_2$ as $-0.2$ and $-10{,}000$,
the contribution of $g_1$ will be much larger than that of $g_2$ (by an order of $10^4$)
in the formulation of the $\phi$ function given by Eq. (7.160). This will badly affect
the convergence rate during the minimization of the $\phi$ function. Thus it is advisable to
normalize the constraints so that they vary between $-1$ and $0$ as far as possible. For
the constraints shown in Eqs. (7.171) and (7.172), the normalization can be done as

$$g_1'(\mathbf{X}) = \frac{g_1(\mathbf{X})}{\delta_{\max}} = \frac{\delta(\mathbf{X})}{\delta_{\max}} - 1 \le 0 \tag{7.173}$$

$$g_2'(\mathbf{X}) = \frac{g_2(\mathbf{X})}{\sigma_{\max}} = \frac{\sigma(\mathbf{X})}{\sigma_{\max}} - 1 \le 0 \tag{7.174}$$

If the constraints are not normalized as shown in Eqs. (7.173) and (7.174), the problem
can still be solved effectively by defining different penalty parameters for different
constraints as

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \frac{R_j}{g_j(\mathbf{X})} \tag{7.175}$$

where $R_1, R_2, \ldots, R_m$ are selected such that the contributions of different $g_j(\mathbf{X})$ to the
$\phi$ function will be approximately the same at the initial point $\mathbf{X}_1$. When the
unconstrained minimization of $\phi(\mathbf{X}, r_k)$ is carried out for a decreasing sequence of values of
$r_k$, the values of $R_1, R_2, \ldots, R_m$ will not be altered; however, they are expected to be
effective in reducing the disparities between the contributions of the various constraints
to the $\phi$ function.
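The two remedies can be illustrated numerically. In the sketch below, `normalize_upper_bound` implements the form of Eqs. (7.173) and (7.174), while `scale_factors` picks $R_j = |g_j(\mathbf{X}_1)|$ so that every term $R_j / g_j(\mathbf{X}_1)$ equals $-1$; that particular choice of $R_j$ is an assumption made here for illustration, since the text only asks for roughly equal contributions. The response values (0.3 in. and 10,000 psi) correspond to $g_1 = -0.2$ and $g_2 = -10{,}000$.

```python
def normalize_upper_bound(response, limit):
    """Return g'(X) = response(X) / limit - 1, which is <= 0 when response <= limit."""
    return lambda x: response(x) / limit - 1.0

def scale_factors(g_values_at_x1):
    """Hypothetical choice R_j = |g_j(X1)|, so that R_j / g_j(X1) = -1 for all j."""
    return [abs(g) for g in g_values_at_x1]

# Usage: deflection 0.3 in. (limit 0.5 in.) and stress 10,000 psi (limit 20,000 psi).
deflection = lambda x: 0.3
stress = lambda x: 10000.0
g1n = normalize_upper_bound(deflection, 0.5)
g2n = normalize_upper_bound(stress, 20000.0)
print(g1n(None), g2n(None))                 # both now lie between -1 and 0

raw = [-0.2, -10000.0]
print([1.0 / g for g in raw])               # unscaled terms differ by a factor of 5e4
R = scale_factors(raw)
print([Rj / g for Rj, g in zip(R, raw)])    # scaled terms are equal at X1
```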

Example 7.7

Minimize $f(x_1, x_2) = \tfrac{1}{3}(x_1 + 1)^3 + x_2$

subject to

$$g_1(x_1, x_2) = -x_1 + 1 \le 0$$

$$g_2(x_1, x_2) = -x_2 \le 0$$

SOLUTION To illustrate the interior penalty function method, we use the calculus
method for solving the unconstrained minimization problem in this case. Hence there
is no need to have an initial feasible point $\mathbf{X}_1$. The $\phi$ function is

$$\phi(\mathbf{X}, r) = \frac{1}{3}(x_1 + 1)^3 + x_2 - r\left( \frac{1}{-x_1 + 1} - \frac{1}{x_2} \right)$$

To find the unconstrained minimum of $\phi$, we use the necessary conditions:

$$\frac{\partial \phi}{\partial x_1} = (x_1 + 1)^2 - \frac{r}{(1 - x_1)^2} = 0, \quad \text{that is,} \quad (x_1^2 - 1)^2 = r$$

$$\frac{\partial \phi}{\partial x_2} = 1 - \frac{r}{x_2^2} = 0, \quad \text{that is,} \quad x_2^2 = r$$

These equations give

$$x_1^*(r) = (r^{1/2} + 1)^{1/2}, \quad x_2^*(r) = r^{1/2}$$

$$\phi_{\min}(r) = \frac{1}{3}\left[ (r^{1/2} + 1)^{1/2} + 1 \right]^3 + 2r^{1/2} - \frac{1}{(1/r) - (1/r^{3/2} + 1/r^2)^{1/2}}$$

To obtain the solution of the original problem, we know that

$$f_{\min} = \lim_{r \to 0} \phi_{\min}(r)$$

$$x_1^* = \lim_{r \to 0} x_1^*(r)$$

$$x_2^* = \lim_{r \to 0} x_2^*(r)$$

The values of $f$, $x_1^*$, and $x_2^*$ corresponding to a decreasing sequence of values of $r$ are
shown in Table 7.3.
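The closed-form expressions of this example are easy to tabulate, which gives a quick check on the entries of Table 7.3. The sketch below evaluates them directly; the function name `example_7_7` is an assumption, and the penalty term is written in the equivalent form $r/(x_1 - 1) + r/x_2$.

```python
import math

def example_7_7(r):
    """Closed-form minimizer of phi for Example 7.7 and the values phi_min(r), f(r)."""
    x1 = math.sqrt(math.sqrt(r) + 1.0)          # x1*(r) = (r^(1/2) + 1)^(1/2)
    x2 = math.sqrt(r)                           # x2*(r) = r^(1/2)
    f = (x1 + 1.0) ** 3 / 3.0 + x2
    # -r * (1/g1 + 1/g2) with g1 = 1 - x1 and g2 = -x2
    phi = f + r / (x1 - 1.0) + r / x2
    return x1, x2, phi, f

for r in (1000.0, 100.0, 10.0, 1.0, 0.1, 0.01, 0.001, 1e-4, 1e-5, 1e-6):
    x1, x2, phi, f = example_7_7(r)
    print(f"{r:10.6f} {x1:9.5f} {x2:9.5f} {phi:10.4f} {f:10.4f}")
```

As $r$ decreases, both `phi` and `f` approach the exact value $8/3$ from above, with `phi` always the larger of the two.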


Table 7.3 Results for Example 7.7

Value of r        $x_1^*(r) = (r^{1/2}+1)^{1/2}$    $x_2^*(r) = r^{1/2}$    $\phi_{\min}(r)$    $f(r)$
1000              5.71164                            31.62278                376.2636            132.4003
100               3.31662                            10.00000                89.9772             36.8109
10                2.04017                            3.16228                 25.3048             12.5286
1                 1.41421                            1.00000                 9.1046              5.6904
0.1               1.14727                            0.31623                 4.6117              3.6164
0.01              1.04881                            0.10000                 3.2716              2.9667
0.001             1.01569                            0.03162                 2.8569              2.7615
0.0001            1.00499                            0.01000                 2.7267              2.6967
0.00001           1.00158                            0.00316                 2.6856              2.6762
0.000001          1.00050                            0.00100                 2.6727              2.6697
Exact solution 0  1                                  0                       8/3                 8/3


Example 7.8

Minimize $f(\mathbf{X}) = x_1^3 - 6x_1^2 + 11x_1 + x_3$

subject to

$$x_1^2 + x_2^2 - x_3^2 \le 0$$
$$4 - x_1^2 - x_2^2 - x_3^2 \le 0$$
$$x_3 - 5 \le 0$$
$$-x_i \le 0, \quad i = 1, 2, 3$$

SOLUTION The interior penalty function method, coupled with the Davidon-Fletcher-Powell
method of unconstrained minimization and the cubic interpolation method of
one-dimensional search, is used to solve this problem. The necessary data are assumed
as follows:

Starting feasible point, $\mathbf{X}_1 = \{0.1,\ 0.1,\ 3.0\}^T$

$r_1 = 1.0$, $f(\mathbf{X}_1) = 4.041$, $\phi(\mathbf{X}_1, r_1) = 25.1849$

The optimum solution of this problem is known to be [7.15]

$$\mathbf{X}^* = \{0,\ \sqrt{2},\ \sqrt{2}\}^T, \quad f^* = \sqrt{2}$$

The results of numerical optimization are summarized in Table 7.4.


Table 7.4 Results for Example 7.8

(Entries shown as - are not legible in the Number of iterations column.)

 k    Value of $r_k$       Starting point for minimizing $\phi_k$    Number of iterations    Optimum $\mathbf{X}_k^*$                    $\phi_k^*$    $f_k^*$
 1    1.0 x 10^0           (0.1, 0.1, 3.0)                           -                       (0.37898, 1.67965, 2.34617)                 10.36219      5.70766
 2    1.0 x 10^-1          (0.37898, 1.67965, 2.34617)               -                       (0.10088, 1.41945, 1.68302)                 4.12440       2.73267
 3    1.0 x 10^-2          (0.10088, 1.41945, 1.68302)               -                       (0.03066, 1.41411, 1.49842)                 2.25437       1.83012
 4    1.0 x 10^-3          (0.03066, 1.41411, 1.49842)               -                       (0.009576, 1.41419, 1.44081)                1.67805       1.54560
 5    1.0 x 10^-4          (0.009576, 1.41419, 1.44081)              -                       (0.003020, 1.41421, 1.42263)                1.49745       1.45579
 6    1.0 x 10^-5          (0.003020, 1.41421, 1.42263)              3                       (0.0009530, 1.41421, 1.41687)               1.44052       1.42735
 7    1.0 x 10^-6          (0.0009530, 1.41421, 1.41687)             3                       (0.0003013, 1.41421, 1.41505)               1.42253       1.41837
 8    1.0 x 10^-7          (0.0003013, 1.41421, 1.41505)             3                       (0.00009535, 1.41421, 1.41448)              1.41684       1.41553
 9    1.0 x 10^-8          (0.00009535, 1.41421, 1.41448)            5                       (0.00003019, 1.41421, 1.41430)              1.41505       1.41463
10    1.0 x 10^-9          (0.00003019, 1.41421, 1.41430)            4                       (0.000009567, 1.41421, 1.41424)             1.41448       1.41435
11    1.0 x 10^-10         (0.000009567, 1.41421, 1.41424)           3                       (0.000003011, 1.41421, 1.41422)             1.41430       1.41426
12    1.0 x 10^-11         (0.000003011, 1.41421, 1.41422)           -                       (0.9562 x 10^-6, 1.41421, 1.41422)          1.41424       1.41423
13    1.0 x 10^-12         (0.9562 x 10^-6, 1.41421, 1.41422)        -                       (0.3248 x 10^-6, 1.41421, 1.41421)          1.41422       1.41422


Convergence Proof. The following theorem proves the convergence of the interior
penalty function method.

Theorem 7.1 If the function

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X})} \tag{7.176}$$

is minimized for a decreasing sequence of values of $r_k$, the unconstrained minima $\mathbf{X}_k^*$
converge to the optimal solution of the constrained problem stated in Eq. (7.153) as
$r_k \to 0$.
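Proof: If $\mathbf{X}^*$ is the optimum solution of the constrained problem, we have to prove
that

$$\lim_{r_k \to 0} \left[ \min \phi(\mathbf{X}, r_k) \right] = \phi(\mathbf{X}_k^*, r_k) = f(\mathbf{X}^*) \tag{7.177}$$

Since $f(\mathbf{X})$ is continuous and $f(\mathbf{X}^*) \le f(\mathbf{X})$ for all feasible points $\mathbf{X}$, we can choose a
feasible point $\bar{\mathbf{X}}$ such that

$$f(\bar{\mathbf{X}}) < f(\mathbf{X}^*) + \frac{\varepsilon}{2} \tag{7.178}$$

for any value of $\varepsilon > 0$. Next select a suitable value of $k$, say $K$, such that

$$r_K \le \frac{\varepsilon / 2m}{\max_j \left[ -\dfrac{1}{g_j(\bar{\mathbf{X}})} \right]} \tag{7.179}$$

From the definition of the $\phi$ function, we have

$$f(\mathbf{X}^*) \le \min \phi(\mathbf{X}, r_k) = \phi(\mathbf{X}_k^*, r_k) \tag{7.180}$$

where $\mathbf{X}_k^*$ is the unconstrained minimum of $\phi(\mathbf{X}, r_k)$. Further,

$$\phi(\mathbf{X}_k^*, r_k) \le \phi(\mathbf{X}_K^*, r_k) \tag{7.181}$$

since $\mathbf{X}_k^*$ minimizes $\phi(\mathbf{X}, r_k)$ and any $\mathbf{X}$ other than $\mathbf{X}_k^*$ leads to a value of $\phi$ greater
than or equal to $\phi(\mathbf{X}_k^*, r_k)$. Further, by choosing $r_k < r_K$, we obtain

$$\phi(\mathbf{X}_K^*, r_K) = f(\mathbf{X}_K^*) - r_K \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_K^*)}
> f(\mathbf{X}_K^*) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_K^*)}
\ge \phi(\mathbf{X}_k^*, r_k) \tag{7.182}$$

as $\mathbf{X}_k^*$ is the unconstrained minimum of $\phi(\mathbf{X}, r_k)$. Thus

$$f(\mathbf{X}^*) \le \phi(\mathbf{X}_k^*, r_k) \le \phi(\mathbf{X}_K^*, r_k) < \phi(\mathbf{X}_K^*, r_K) \tag{7.183}$$

But

$$\phi(\mathbf{X}_K^*, r_K) \le \phi(\bar{\mathbf{X}}, r_K) = f(\bar{\mathbf{X}}) - r_K \sum_{j=1}^{m} \frac{1}{g_j(\bar{\mathbf{X}})} \tag{7.184}$$

Combining the inequalities (7.183) and (7.184), we have

$$f(\mathbf{X}^*) \le \phi(\mathbf{X}_k^*, r_k) \le f(\bar{\mathbf{X}}) - r_K \sum_{j=1}^{m} \frac{1}{g_j(\bar{\mathbf{X}})} \tag{7.185}$$

Inequality (7.179) gives

$$-r_K \sum_{j=1}^{m} \frac{1}{g_j(\bar{\mathbf{X}})} \le \frac{\varepsilon}{2} \tag{7.186}$$

By using inequalities (7.178) and (7.186), inequality (7.185) becomes

$$f(\mathbf{X}^*) \le \phi(\mathbf{X}_k^*, r_k) < f(\mathbf{X}^*) + \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = f(\mathbf{X}^*) + \varepsilon$$

or

$$\phi(\mathbf{X}_k^*, r_k) - f(\mathbf{X}^*) < \varepsilon \tag{7.187}$$

Given any $\varepsilon > 0$ (however small it may be), it is possible to choose a value of $k$ so as
to satisfy the inequality (7.187). Hence as $k \to \infty$ ($r_k \to 0$), we have

$$\lim_{r_k \to 0} \phi(\mathbf{X}_k^*, r_k) = f(\mathbf{X}^*)$$

This completes the proof of the theorem.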


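Additional Results. From the proof above, it follows that as $r_k \to 0$,

$$\lim_{k \to \infty} f(\mathbf{X}_k^*) = f(\mathbf{X}^*) \tag{7.188}$$

$$\lim_{k \to \infty} r_k \left( -\sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_k^*)} \right) = 0 \tag{7.189}$$

It can also be shown that if $r_1, r_2, \ldots$ is a strictly decreasing sequence of positive values,
the sequence $f(\mathbf{X}_1^*), f(\mathbf{X}_2^*), \ldots$ will also be strictly decreasing. For this, consider two
consecutive parameters, say, $r_k$ and $r_{k+1}$, with

$$0 < r_{k+1} < r_k \tag{7.190}$$

Then we have

$$f(\mathbf{X}_{k+1}^*) - r_{k+1} \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_{k+1}^*)} < f(\mathbf{X}_k^*) - r_{k+1} \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_k^*)} \tag{7.191}$$

since $\mathbf{X}_{k+1}^*$ alone minimizes $\phi(\mathbf{X}, r_{k+1})$. Similarly,

$$f(\mathbf{X}_k^*) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_k^*)} < f(\mathbf{X}_{k+1}^*) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_{k+1}^*)} \tag{7.192}$$

Divide Eq. (7.191) by $r_{k+1}$, Eq. (7.192) by $r_k$, and add the resulting inequalities to
obtain

$$\frac{1}{r_{k+1}} f(\mathbf{X}_{k+1}^*) - \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_{k+1}^*)} + \frac{1}{r_k} f(\mathbf{X}_k^*) - \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_k^*)}
< \frac{1}{r_{k+1}} f(\mathbf{X}_k^*) - \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_k^*)} + \frac{1}{r_k} f(\mathbf{X}_{k+1}^*) - \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X}_{k+1}^*)} \tag{7.193}$$

Canceling the common terms from both sides, we can write the inequality (7.193) as

$$f(\mathbf{X}_{k+1}^*) \left( \frac{1}{r_{k+1}} - \frac{1}{r_k} \right) < f(\mathbf{X}_k^*) \left( \frac{1}{r_{k+1}} - \frac{1}{r_k} \right) \tag{7.194}$$

since

$$\frac{1}{r_{k+1}} - \frac{1}{r_k} = \frac{r_k - r_{k+1}}{r_k r_{k+1}} > 0 \tag{7.195}$$

we obtain

$$f(\mathbf{X}_{k+1}^*) < f(\mathbf{X}_k^*) \tag{7.196}$$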

7.14 CONVEX PROGRAMMING PROBLEM

In Section 7.13 we saw that the sequential minimization of

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X})}, \quad r_k > 0 \tag{7.197}$$

for a decreasing sequence of values of $r_k$ gives the minima $\mathbf{X}_k^*$. As $k \to \infty$, these points
$\mathbf{X}_k^*$ converge to the minimum of the constrained problem:

Minimize $f(\mathbf{X})$

subject to

$$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{7.198}$$

To ensure the existence of a global minimum of $\phi(\mathbf{X}, r_k)$ for every positive value
of $r_k$, $\phi$ has to be a strictly convex function of $\mathbf{X}$. The following theorem gives the
sufficient conditions for the $\phi$ function to be strictly convex. If $\phi$ is convex, for every
$r_k > 0$ there exists a unique minimum of $\phi(\mathbf{X}, r_k)$.

Theorem 7.2 If $f(\mathbf{X})$ and $g_j(\mathbf{X})$ are convex and at least one of $f(\mathbf{X})$ and $g_j(\mathbf{X})$ is
strictly convex, the function $\phi(\mathbf{X}, r_k)$ defined by Eq. (7.197) will be a strictly convex
function of $\mathbf{X}$.

Proof: This theorem can be proved in two steps. In the first step we prove that if a
function $g_k(\mathbf{X})$ is convex, $1/g_k(\mathbf{X})$ will be concave. In the second step, we prove that
a positive combination of convex functions is convex, and strictly convex if at least
one of the functions is strictly convex.

Thus Theorem A.3 of Appendix A guarantees that the sequential minimization of
$\phi(\mathbf{X}, r_k)$ for a decreasing sequence of values of $r_k$ leads to the global minimum of the
original constrained problem. When the convexity conditions are not satisfied, or when
the functions are so complex that we do not know beforehand whether the convexity
conditions are satisfied, it will not be possible to prove that the minimum found by the
SUMT method is a global one. In such cases one has to be satisfied with a local minimum
only. However, one can always reapply the SUMT method from different feasible
starting points and try to find a better local minimum point if the problem has several
local minima. Of course, this procedure requires more computational effort.


7.15 EXTERIOR PENALTY FUNCTION METHOD

In the exterior penalty function method, the $\phi$ function is generally taken as

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \langle g_j(\mathbf{X}) \rangle^q \tag{7.199}$$

where $r_k$ is a positive penalty parameter, the exponent $q$ is a nonnegative constant, and
the bracket function $\langle g_j(\mathbf{X}) \rangle$ is defined as

$$\langle g_j(\mathbf{X}) \rangle = \max\left( g_j(\mathbf{X}), 0 \right) =
\begin{cases}
g_j(\mathbf{X}) & \text{if } g_j(\mathbf{X}) > 0 \text{ (constraint is violated)} \\
0 & \text{if } g_j(\mathbf{X}) \le 0 \text{ (constraint is satisfied)}
\end{cases} \tag{7.200}$$

It can be seen from Eq. (7.199) that the effect of the second term on the right side is to
increase $\phi(\mathbf{X}, r_k)$ in proportion to the $q$th power of the amount by which the constraints
are violated. Thus there will be a penalty for violating the constraints, and the amount of
penalty will increase at a faster rate than will the amount of violation of a constraint (for
$q > 1$). This is the reason why the formulation is called the penalty function method.
Usually, the function $\phi(\mathbf{X}, r_k)$ possesses a minimum as a function of $\mathbf{X}$ in the infeasible
region. The unconstrained minima $\mathbf{X}_k^*$ converge to the optimal solution of the original
problem as $k \to \infty$ and $r_k \to \infty$. Thus the unconstrained minima approach the feasible
domain gradually, and as $k \to \infty$, $\mathbf{X}_k^*$ eventually lies in the feasible region. Let us
consider Eq. (7.199) for various values of $q$.

1. $q = 0$. Here the $\phi$ function is given by

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \langle g_j(\mathbf{X}) \rangle^0 =
\begin{cases}
f(\mathbf{X}) + m r_k & \text{if all } g_j(\mathbf{X}) > 0 \\
f(\mathbf{X}) & \text{if all } g_j(\mathbf{X}) \le 0
\end{cases} \tag{7.201}$$

This function is discontinuous on the boundary of the acceptable region as
shown in Fig. 7.11 and hence it would be very difficult to minimize this function.

2. $0 < q < 1$. Here the $\phi$ function will be continuous, but the penalty for violating
a constraint may be too small. Also, the derivatives of the function are
discontinuous along the boundary. Thus it will be difficult to minimize the $\phi$ function.
Typical contours of the $\phi$ function are shown in Fig. 7.12.

Figure 7.11 A $\phi$ function discontinuous for $q = 0$.

Figure 7.12 Derivatives of a $\phi$ function discontinuous for $0 < q < 1$.

3. $q = 1$. In this case, under certain restrictions, it has been shown by Zangwill
[7.16] that there exists an $r_0$ so large that the minimum of $\phi(\mathbf{X}, r_k)$ is exactly
the constrained minimum of the original problem for all $r_k > r_0$. However, the
contours of the $\phi$ function look similar to those shown in Fig. 7.12 and possess
discontinuous first derivatives along the boundary. Hence despite the
convenience of choosing a single $r_k$ that yields the constrained minimum in one
unconstrained minimization, the method is not very attractive from a
computational point of view.


Figure 7.13 A $\phi$ function for $q > 1$ [panels (a) and (b); one panel shows the section on A-A].


4. $q > 1$. The $\phi$ function will have continuous first derivatives in this case as shown
in Fig. 7.13. These derivatives are given by

$$\frac{\partial \phi}{\partial x_i} = \frac{\partial f}{\partial x_i} + r_k \sum_{j=1}^{m} q \langle g_j(\mathbf{X}) \rangle^{q-1} \frac{\partial g_j(\mathbf{X})}{\partial x_i} \tag{7.202}$$

Generally, the value of $q$ is chosen as 2 in practical computation. We assume a
value of $q > 1$ in subsequent discussion of this method.

Algorithm. The exterior penalty function method can be stated by the following
steps:

1. Start from any design $\mathbf{X}_1$ and a suitable value of $r_1$. Set $k = 1$.

2. Find the vector $\mathbf{X}_k^*$ that minimizes the function

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \langle g_j(\mathbf{X}) \rangle^q$$

3. Test whether the point $\mathbf{X}_k^*$ satisfies all the constraints. If $\mathbf{X}_k^*$ is feasible, it is the
desired optimum and hence terminate the procedure. Otherwise, go to step 4.

4. Choose the next value of the penalty parameter that satisfies the relation

$$r_{k+1} > r_k$$

and set the new value of $k$ as original $k$ plus 1 and go to step 2. Usually,
the value of $r_{k+1}$ is chosen according to the relation $r_{k+1} = c\,r_k$, where $c$ is a
constant greater than 1.
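The bracket function and the exterior $\phi$ function of Eq. (7.199) translate almost directly into code. The sketch below is illustrative only: the names `bracket` and `exterior_phi` are assumptions, and the case $q = 0$ is excluded because it needs special handling at satisfied constraints. The resulting callable can be passed to any unconstrained minimizer in step 2, with $r_k$ increased between cycles as in step 4.

```python
def bracket(g):
    """<g> = max(g, 0): zero for a satisfied constraint, g for a violated one."""
    return g if g > 0.0 else 0.0

def exterior_phi(f, constraints, r, q=2):
    """Return phi(X, r) = f(X) + r * sum(<g_j(X)>^q), intended for q > 0."""
    def phi(x):
        return f(x) + r * sum(bracket(gj(x)) ** q for gj in constraints)
    return phi

# Usage: f = (x1 + 1)^3 / 3 + x2 with g1 = 1 - x1 <= 0 and g2 = -x2 <= 0.
f = lambda x: (x[0] + 1.0) ** 3 / 3.0 + x[1]
cons = [lambda x: 1.0 - x[0], lambda x: -x[1]]
phi = exterior_phi(f, cons, r=1.0, q=2)
print(phi([0.23607, -0.5]))    # infeasible point: the penalty is added
print(phi([2.0, 1.0]), f([2.0, 1.0]))   # feasible point: phi equals f
```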
Example 7.9

Minimize $f(x_1, x_2) = \tfrac{1}{3}(x_1 + 1)^3 + x_2$

subject to

$$g_1(x_1, x_2) = 1 - x_1 \le 0$$
$$g_2(x_1, x_2) = -x_2 \le 0$$

SOLUTION To illustrate the exterior penalty function method, we solve the
unconstrained minimization problem by using the differential calculus method. As such, it is not
necessary to have an initial trial point $\mathbf{X}_1$. The $\phi$ function is

$$\phi(\mathbf{X}, r) = \tfrac{1}{3}(x_1 + 1)^3 + x_2 + r[\max(0, 1 - x_1)]^2 + r[\max(0, -x_2)]^2$$

The necessary conditions for the unconstrained minimum of $\phi(\mathbf{X}, r)$ are

$$\frac{\partial \phi}{\partial x_1} = (x_1 + 1)^2 - 2r[\max(0, 1 - x_1)] = 0$$

$$\frac{\partial \phi}{\partial x_2} = 1 - 2r[\max(0, -x_2)] = 0$$

These equations can be written as

$$\min[(x_1 + 1)^2,\ (x_1 + 1)^2 - 2r(1 - x_1)] = 0 \tag{E1}$$

$$\min[1,\ 1 + 2r x_2] = 0 \tag{E2}$$

In Eq. (E1), if $(x_1 + 1)^2 = 0$, $x_1 = -1$ (this violates the first constraint), and if

$$(x_1 + 1)^2 - 2r(1 - x_1) = 0, \quad x_1 = -1 - r + \sqrt{r^2 + 4r}$$

In Eq. (E2), the only possibility is that $1 + 2r x_2 = 0$ and hence $x_2 = -1/(2r)$. Thus the
solution of the unconstrained minimization problem is given by

$$x_1^*(r) = -1 - r + r\left( 1 + \frac{4}{r} \right)^{1/2} \tag{E3}$$

$$x_2^*(r) = -\frac{1}{2r} \tag{E4}$$

From this, the solution of the original constrained problem can be obtained as

$$x_1^* = \lim_{r \to \infty} x_1^*(r) = 1, \quad x_2^* = \lim_{r \to \infty} x_2^*(r) = 0$$

$$f_{\min} = \lim_{r \to \infty} \phi_{\min}(r) = \frac{8}{3}$$

The convergence of the method, as $r$ increases gradually, can be seen from Table 7.5.

Table 7.5 Results for Example 7.9

Value of r    $x_1^*$      $x_2^*$        $\phi_{\min}(r)$    $f_{\min}(r)$
0.001         -0.93775     -500.00000     -249.9962           -500.0000
0.01          -0.80975     -50.00000      -24.9650            -49.9977
0.1           -0.45969     -5.00000       -2.2344             -4.9474
1             0.23607      -0.50000       0.9631              0.1295
10            0.83216      -0.05000       2.3068              2.0001
100           0.98039      -0.00500       2.6249              2.5840
1,000         0.99800      -0.00050       2.6624              2.6582
10,000        0.99963      -0.00005       2.6655              2.6652
$\infty$      1            0              8/3                 8/3
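The closed-form results (E3) and (E4) can be evaluated for an increasing sequence of $r$ to watch the unconstrained minima approach the feasible region from outside. The sketch below does this; the function name `example_7_9` is an assumption. It also checks numerically that $f$ does not decrease and the total violation measure $G$ does not increase as $r$ grows.

```python
import math

def example_7_9(r):
    """Closed-form minimizer of the exterior phi (q = 2) for Example 7.9."""
    x1 = -1.0 - r + math.sqrt(r * r + 4.0 * r)          # Eq. (E3)
    x2 = -1.0 / (2.0 * r)                               # Eq. (E4)
    f = (x1 + 1.0) ** 3 / 3.0 + x2
    G = max(0.0, 1.0 - x1) ** 2 + max(0.0, -x2) ** 2    # sum of <g_j>^2
    return x1, x2, f + r * G, f, G

rows = [example_7_9(r) for r in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0)]
for (x1, x2, phi, f, G) in rows:
    print(f"{x1:9.5f} {x2:11.5f} {phi:10.4f} {f:10.4f}")

f_values = [row[3] for row in rows]
G_values = [row[4] for row in rows]
print(f_values == sorted(f_values), G_values == sorted(G_values, reverse=True))
```

Every minimizer printed here is infeasible ($x_1 < 1$, $x_2 < 0$), which is the characteristic behaviour of the exterior method for finite $r$.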


Convergence Proof. To prove the convergence of the algorithm given above, we
assume that $f$ and $g_j$, $j = 1, 2, \ldots, m$, are continuous and that an optimum solution
exists for the given problem. The following results are useful in proving the convergence
of the exterior penalty function method.

Theorem 7.3 If

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k G[\mathbf{g}(\mathbf{X})] = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \langle g_j(\mathbf{X}) \rangle^q$$

the following relations will be valid for any $0 < r_k < r_{k+1}$:

1. $\phi(\mathbf{X}_k^*, r_k) \le \phi(\mathbf{X}_{k+1}^*, r_{k+1})$.

2. $f(\mathbf{X}_k^*) \le f(\mathbf{X}_{k+1}^*)$.

3. $G[\mathbf{g}(\mathbf{X}_k^*)] \ge G[\mathbf{g}(\mathbf{X}_{k+1}^*)]$.

Proof: The proof is similar to that of Theorem 7.1.

Theorem 7.4 If the function $\phi(\mathbf{X}, r_k)$ given by Eq. (7.199) is minimized for an
increasing sequence of values of $r_k$, the unconstrained minima $\mathbf{X}_k^*$ converge to the optimum
solution ($\mathbf{X}^*$) of the constrained problem as $r_k \to \infty$.

Proof: The proof is similar to that of Theorem 7.1 (see Problem 7.46).


7.16 EXTRAPOLATION TECHNIQUES IN THE INTERIOR
PENALTY FUNCTION METHOD

In the interior penalty function method, the $\phi$ function is minimized sequentially for
a decreasing sequence of values $r_1 > r_2 > \cdots > r_k$ to find the unconstrained minima
$\mathbf{X}_1^*, \mathbf{X}_2^*, \ldots, \mathbf{X}_k^*$, respectively. Let the values of the objective function corresponding to
$\mathbf{X}_1^*, \mathbf{X}_2^*, \ldots, \mathbf{X}_k^*$ be $f_1^*, f_2^*, \ldots, f_k^*$, respectively. It has been proved that the sequence
$\mathbf{X}_1^*, \mathbf{X}_2^*, \ldots, \mathbf{X}_k^*$ converges to the minimum point $\mathbf{X}^*$, and the sequence $f_1^*, f_2^*, \ldots, f_k^*$
to the minimum value $f^*$ of the original constrained problem stated in Eq. (7.153) as
$r_k \to 0$. After carrying out a certain number of unconstrained minimizations of $\phi$, the
results obtained thus far can be used to estimate the minimum of the original constrained
problem by a method known as the extrapolation technique. The extrapolations of the
design vector and the objective function are considered in this section.


7.16.1 Extrapolation of the Design Vector X

Since different vectors $\mathbf{X}_i^*$, $i = 1, 2, \ldots, k$, are obtained as unconstrained minima of
$\phi(\mathbf{X}, r_i)$ for different $r_i$, $i = 1, 2, \ldots, k$, the unconstrained minimum of $\phi(\mathbf{X}, r)$ for any
value of $r$, $\mathbf{X}^*(r)$, can be approximated by a polynomial in $r$ as

$$\mathbf{X}^*(r) = \sum_{j=0}^{k-1} \mathbf{A}_j r^j = \mathbf{A}_0 + r\mathbf{A}_1 + r^2 \mathbf{A}_2 + \cdots + r^{k-1} \mathbf{A}_{k-1} \tag{7.203}$$

where $\mathbf{A}_j$ are $n$-component vectors. By substituting the known conditions

$$\mathbf{X}^*(r = r_i) = \mathbf{X}_i^*, \quad i = 1, 2, \ldots, k \tag{7.204}$$

in Eq. (7.203), we can determine the vectors $\mathbf{A}_j$, $j = 0, 1, 2, \ldots, k - 1$ uniquely. Then
$\mathbf{X}^*(r)$, given by Eq. (7.203), will be a good approximation for the unconstrained
minimum of $\phi(\mathbf{X}, r)$ in the interval $(0, r_1)$. By setting $r = 0$ in Eq. (7.203), we can obtain
an estimate to the true minimum, $\mathbf{X}^*$, as

$$\mathbf{X}^* = \mathbf{X}^*(r = 0) = \mathbf{A}_0 \tag{7.205}$$

It is to be noted that it is not necessary to approximate $\mathbf{X}^*(r)$ by a $(k - 1)$st-order
polynomial in $r$. In fact, any polynomial of order $1 \le p \le k - 1$ can be used to
approximate $\mathbf{X}^*(r)$. In such a case we need only $p + 1$ points out of $\mathbf{X}_1^*, \mathbf{X}_2^*, \ldots, \mathbf{X}_k^*$ to define
the polynomial completely.

As the simplest case, let us consider approximating $\mathbf{X}^*(r)$ by a first-order polynomial
(linear equation) in $r$ as

$$\mathbf{X}^*(r) = \mathbf{A}_0 + r\mathbf{A}_1 \tag{7.206}$$

To evaluate the vectors $\mathbf{A}_0$ and $\mathbf{A}_1$, we need the data of two unconstrained minima. If
the extrapolation is being done at the end of the $k$th unconstrained minimization, we
generally use the latest information to find the constant vectors $\mathbf{A}_0$ and $\mathbf{A}_1$. Let $\mathbf{X}_{k-1}^*$
and $\mathbf{X}_k^*$ be the unconstrained minima corresponding to $r_{k-1}$ and $r_k$, respectively. Since
$r_k = c\,r_{k-1}$ ($c < 1$), Eq. (7.206) gives

$$\mathbf{X}^*(r = r_{k-1}) = \mathbf{A}_0 + r_{k-1} \mathbf{A}_1 = \mathbf{X}_{k-1}^*$$
$$\mathbf{X}^*(r = r_k) = \mathbf{A}_0 + c\,r_{k-1} \mathbf{A}_1 = \mathbf{X}_k^* \tag{7.207}$$

These equations give

$$\mathbf{A}_0 = \frac{\mathbf{X}_k^* - c\mathbf{X}_{k-1}^*}{1 - c}, \quad \mathbf{A}_1 = \frac{\mathbf{X}_{k-1}^* - \mathbf{X}_k^*}{r_{k-1}(1 - c)} \tag{7.208}$$

From Eqs. (7.206) and (7.208), the extrapolated value of the true minimum can be
obtained as

$$\mathbf{X}^*(r = 0) = \mathbf{A}_0 = \frac{\mathbf{X}_k^* - c\mathbf{X}_{k-1}^*}{1 - c} \tag{7.209}$$
The extrapolation technique [Eq. (7.203)] has several advantages:

1. It can be used to find a good estimate to the optimum of the original problem
with the help of Eq. (7.205).

2. It can be used to provide an additional convergence criterion to terminate the
minimization process. The point obtained at the end of the $k$th iteration, $\mathbf{X}_k^*$,
can be taken as the true minimum if the relation

$$|\mathbf{X}_k^* - \mathbf{X}^*(r = 0)| \le \boldsymbol{\varepsilon} \tag{7.210}$$

is satisfied, where $\boldsymbol{\varepsilon}$ is the vector of prescribed small quantities.

3. This method can also be used to estimate the next minimum of the $\phi$ function
after a number of minimizations have been completed. This estimate† can be
used as a starting point for the $(k + 1)$st minimization of the $\phi$ function. The
estimate of the $(k + 1)$st minimum, based on the information collected from the
previous $k$ minima, is given by Eq. (7.203) as

$$\mathbf{X}_{k+1}^* \simeq \mathbf{X}^*(r = r_{k+1} = r_1 c^k)
= \mathbf{A}_0 + (r_1 c^k)\mathbf{A}_1 + (r_1 c^k)^2 \mathbf{A}_2 + \cdots + \mathbf{A}_{k-1} (r_1 c^k)^{k-1} \tag{7.211}$$

If Eqs. (7.206) and (7.208) are used, this estimate becomes

$$\mathbf{X}_{k+1}^* \simeq \mathbf{X}^*(r = c^2 r_{k-1}) = \mathbf{A}_0 + c^2 r_{k-1} \mathbf{A}_1
= (1 + c)\mathbf{X}_k^* - c\mathbf{X}_{k-1}^* \tag{7.212}$$

Discussion. It has been proved that under certain conditions, the difference between
the true minimum $\mathbf{X}^*$ and the estimate $\mathbf{X}^*(r = 0) = \mathbf{A}_0$ will be of the order $r_1^k$ [7.17].
Thus as $r_1 \to 0$, $\mathbf{A}_0 \to \mathbf{X}^*$. Moreover, if $r_1 < 1$, the estimates of $\mathbf{X}^*$ obtained by
using $k$ minima will be better than those using $(k - 1)$ minima, and so on. Hence as
more minima are achieved, the estimate of $\mathbf{X}^*$ or $\mathbf{X}_{k+1}^*$ presumably gets better. This
estimate can be used as the starting point for the $(k + 1)$st minimization of the $\phi$
function. This accelerates the entire process by substantially reducing the effort needed
to minimize the successive $\phi$ functions. However, the computer storage requirements
and accuracy considerations (such as numerical round-off errors that become important
for higher-order estimates) limit the order of the polynomial in Eq. (7.203). It has been
found in practice that extrapolations with the help of even quadratic and cubic equations
in $r$ generally yield good estimates for $\mathbf{X}_{k+1}^*$ and $\mathbf{X}^*$. Note that the extrapolated points
given by any of Eqs. (7.205), (7.209), (7.211), and (7.212) may sometimes violate the
constraints. Hence we have to check any extrapolated point for feasibility before using
it as a starting point for the next minimization of $\phi$. If the extrapolated point is found
infeasible, it has to be rejected.
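The linear extrapolation formulas, Eqs. (7.209) and (7.212), together with the feasibility check just described, can be sketched as follows. The function names are hypothetical, and the usage lines rely on synthetic data constructed so that $\mathbf{X}^*(r)$ is exactly linear in $r$; with real minima the extrapolated points are only estimates.

```python
def extrapolate_minimum(x_prev, x_curr, c):
    """A_0 = (X_k - c X_{k-1}) / (1 - c), the estimate of X* at r = 0 [Eq. (7.209)]."""
    return [(xk - c * xp) / (1.0 - c) for xp, xk in zip(x_prev, x_curr)]

def estimate_next_minimum(x_prev, x_curr, c):
    """(1 + c) X_k - c X_{k-1}, the estimate of X_{k+1} [Eq. (7.212)]."""
    return [(1.0 + c) * xk - c * xp for xp, xk in zip(x_prev, x_curr)]

def is_strictly_feasible(x, constraints):
    """An extrapolated point is usable as a starting point only if all g_j(x) < 0."""
    return all(g(x) < 0.0 for g in constraints)

# Synthetic data: X*(r) = A0 + r * A1 with A0 = (1, 2), A1 = (3, -1),
# sampled at r_{k-1} = 0.1 and r_k = c * r_{k-1} = 0.01.
c = 0.1
x_prev, x_curr = [1.3, 1.9], [1.03, 1.99]
print(extrapolate_minimum(x_prev, x_curr, c))      # close to (1, 2)
guess = estimate_next_minimum(x_prev, x_curr, c)   # close to X*(0.001) = (1.003, 1.999)
print(guess, is_strictly_feasible(guess, [lambda x: 1.0 - x[0]]))
```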

† The estimate obtained for $\mathbf{X}^*$ can also be used as a starting point for the $(k + 1)$st minimization of the $\phi$
function.

7.16.2 Extrapolation of the Function f

As in the case of the design vector, it is possible to use the extrapolation technique
to estimate the optimum value of the original objective function, $f^*$. For this, let
$f_1^*, f_2^*, \ldots, f_k^*$ be the values of the objective function corresponding to the vectors
$\mathbf{X}_1^*, \mathbf{X}_2^*, \ldots, \mathbf{X}_k^*$. Since the points $\mathbf{X}_1^*, \mathbf{X}_2^*, \ldots, \mathbf{X}_k^*$ have been found to be the
unconstrained minima of the $\phi$ function corresponding to $r_1, r_2, \ldots, r_k$, respectively, the
objective function, $f^*$, can be assumed to be a function of $r$. By approximating $f^*$ by
a $(k - 1)$st-order polynomial in $r$, we have

$$f^*(r) = \sum_{j=0}^{k-1} a_j r^j = a_0 + a_1 r + a_2 r^2 + \cdots + a_{k-1} r^{k-1} \tag{7.213}$$

where the $k$ constants $a_j$, $j = 0, 1, 2, \ldots, k - 1$ can be evaluated by substituting the
known conditions

$$f^*(r = r_i) = f_i^* = a_0 + a_1 r_i + a_2 r_i^2 + \cdots + a_{k-1} r_i^{k-1}, \quad i = 1, 2, \ldots, k \tag{7.214}$$

Since Eq. (7.213) is a good approximation for the true $f^*$ in the interval $(0, r_1)$, we
can obtain an estimate for the constrained minimum of $f$ as

$$f^* \simeq f^*(r = 0) = a_0 \tag{7.215}$$

As a particular case, a linear approximation can be made for $f^*$ by using the last two
data points. Thus if $f_{k-1}^*$ and $f_k^*$ are the function values corresponding to $r_{k-1}$ and
$r_k = c\,r_{k-1}$, we have

$$f_{k-1}^* = a_0 + r_{k-1} a_1$$
$$f_k^* = a_0 + c\,r_{k-1} a_1 \tag{7.216}$$

These equations yield

$$a_0 = \frac{f_k^* - c f_{k-1}^*}{1 - c} \tag{7.217}$$

$$a_1 = \frac{f_{k-1}^* - f_k^*}{r_{k-1}(1 - c)} \tag{7.218}$$

$$f^*(r) = \frac{f_k^* - c f_{k-1}^*}{1 - c} + \frac{r}{r_{k-1}} \, \frac{f_{k-1}^* - f_k^*}{1 - c} \tag{7.219}$$

Equation (7.219) gives an estimate of $f^*$ as

$$f^* \simeq f^*(r = 0) = a_0 = \frac{f_k^* - c f_{k-1}^*}{1 - c} \tag{7.220}$$

The extrapolated value $a_0$ can be used to provide an additional convergence criterion
for terminating the interior penalty function method. The criterion is that whenever the
value of $f_k^*$ obtained at the end of the $k$th unconstrained minimization of $\phi$ is sufficiently
close to the extrapolated value $a_0$, that is, when

$$\left| \frac{f_k^* - a_0}{f_k^*} \right| \le \varepsilon \tag{7.221}$$

where $\varepsilon$ is a specified small quantity, the process can be terminated.
Example 7.10 Find the extrapolated values of $\mathbf{X}$ and $f$ in Example 7.8 using the
results of minimization of $\phi(\mathbf{X}, r_1)$ and $\phi(\mathbf{X}, r_2)$.

SOLUTION From the results of Example 7.8, we have for $r_1 = 1.0$,

$$\mathbf{X}_1^* = \{0.37898,\ 1.67965,\ 2.34617\}^T, \quad f_1^* = 5.70766$$

and for $r_2 = 0.1$,

$$c = 0.1, \quad \mathbf{X}_2^* = \{0.10088,\ 1.41945,\ 1.68302\}^T, \quad f_2^* = 2.73267$$

By using Eq. (7.206) for approximating $\mathbf{X}^*(r)$, the extrapolated vector $\mathbf{X}^*$ is given by
Eq. (7.209) as

$$\mathbf{X}^* \simeq \mathbf{A}_0 = \frac{\mathbf{X}_2^* - c\mathbf{X}_1^*}{1 - c}
= \frac{1}{0.9} \left[ \begin{Bmatrix} 0.10088 \\ 1.41945 \\ 1.68302 \end{Bmatrix} - 0.1 \begin{Bmatrix} 0.37898 \\ 1.67965 \\ 2.34617 \end{Bmatrix} \right] \tag{E1}$$

$$= \begin{Bmatrix} 0.06998 \\ 1.39053 \\ 1.60933 \end{Bmatrix} \tag{E2}$$

Similarly, the linear relationship $f^*(r) = a_0 + a_1 r$ leads to [from Eq. (7.220)]

$$f^* \simeq \frac{f_2^* - c f_1^*}{1 - c} = \frac{1}{0.9} [2.73267 - 0.1(5.70766)] = 2.40211 \tag{E3}$$

It can be verified that the extrapolated design vector $\mathbf{X}^*$ is feasible and hence can be
used as a better starting point for the subsequent minimization of the function $\phi$.
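The arithmetic of this example, including the feasibility check against the constraints of Example 7.8, can be reproduced with a short script. The variable and function names below are illustrative choices; small differences in the last printed digit relative to the values quoted above come from rounding.

```python
def linear_extrapolate(prev, curr, c):
    """(curr - c * prev) / (1 - c): value at r = 0 of the line through two minima."""
    return (curr - c * prev) / (1.0 - c)

c = 0.1
X1, f1 = [0.37898, 1.67965, 2.34617], 5.70766      # minimum for r_1 = 1.0
X2, f2 = [0.10088, 1.41945, 1.68302], 2.73267      # minimum for r_2 = 0.1

X_ext = [linear_extrapolate(a, b, c) for a, b in zip(X1, X2)]   # Eq. (7.209)
f_ext = linear_extrapolate(f1, f2, c)                           # Eq. (7.220)

# Constraints of Example 7.8, written as g_j(X) <= 0.
constraints = [
    lambda x: x[0] ** 2 + x[1] ** 2 - x[2] ** 2,
    lambda x: 4.0 - x[0] ** 2 - x[1] ** 2 - x[2] ** 2,
    lambda x: x[2] - 5.0,
    lambda x: -x[0],
    lambda x: -x[1],
    lambda x: -x[2],
]
feasible = all(g(X_ext) < 0.0 for g in constraints)
print([round(v, 5) for v in X_ext], round(f_ext, 5), feasible)
```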


7.17 EXTENDED INTERIOR PENALTY FUNCTION METHODS 

In the interior penalty function approach, the $\phi$ function is defined within the feasible domain. As such, if any of the one-dimensional minimization methods discussed in Chapter 5 is used, the resulting optimal step lengths might lead to infeasible designs. Thus the one-dimensional minimization methods have to be modified to avoid this problem. An alternative method, known as the extended interior penalty function method, has been proposed in which the $\phi$ function is defined outside the feasible region. The extended interior penalty function method combines the best features of the interior and exterior methods for inequality constraints. Several types of extended interior penalty function formulations are described in this section.

7.17.1 Linear Extended Penalty Function Method 

The linear extended penalty function method was originally proposed by Kavlie and Moe [7.18] and later improved by Cassis and Schmit [7.19]. In this method, the $\phi_k$ function is constructed as follows:

$$\phi_k = \phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \tilde{g}_j(\mathbf{X}) \tag{7.222}$$

where

$$\tilde{g}_j(\mathbf{X}) = \begin{cases} -\dfrac{1}{g_j(\mathbf{X})} & \text{if } g_j(\mathbf{X}) \le \varepsilon \\[2ex] -\dfrac{2\varepsilon - g_j(\mathbf{X})}{\varepsilon^2} & \text{if } g_j(\mathbf{X}) > \varepsilon \end{cases} \tag{7.223}$$

and $\varepsilon$ is a small negative number that marks the transition from the interior penalty [$g_j(\mathbf{X}) \le \varepsilon$] to the extended penalty [$g_j(\mathbf{X}) > \varepsilon$]. To produce a sequence of improved feasible designs, the value of $\varepsilon$ is to be selected such that the function $\phi_k$ will have a positive slope at the constraint boundary. Usually, $\varepsilon$ is chosen as

$$\varepsilon = -c\,(r_k)^a \tag{7.224}$$

where $c$ and $a$ are constants. The constant $a$ is chosen such that $\tfrac{1}{3} \le a \le \tfrac{1}{2}$, where the value of $a = \tfrac{1}{3}$ guarantees that the penalty for violating the constraints increases as $r_k$ goes to zero while the value of $a = \tfrac{1}{2}$ is required to help keep the minimum point $\mathbf{X}^*$ in the quadratic range of the penalty function. At the start of optimization, $\varepsilon$ is selected in the range $-0.3 \le \varepsilon \le -0.1$. The value of $r_1$ is selected such that the values of $f(\mathbf{X})$ and $r_1 \sum_{j=1}^{m} \tilde{g}_j(\mathbf{X})$ are equal at the initial design vector $\mathbf{X}_1$. This defines the value of $c$ in Eq. (7.224). The value of $\varepsilon$ is computed at the beginning of each unconstrained minimization using the current value of $r_k$ from Eq. (7.224) and is kept constant throughout that unconstrained minimization. A flowchart for implementing the linear extended penalty function method is given in Fig. 7.14.
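The piecewise penalty term can be sketched in a few lines. The block below is an illustrative sketch only, assuming the term is $-1/g$ for $g \le \varepsilon$ and $-(2\varepsilon - g)/\varepsilon^2$ otherwise, with $\varepsilon = -c\,(r_k)^a$; the function names and the default values $c = 0.2$, $a = 0.5$ are assumptions made for the illustration.

```python
def transition(r_k, c=0.2, a=0.5):
    """Transition value eps = -c * r_k**a (a small negative number)."""
    return -c * r_k ** a

def linear_extended_term(g, eps):
    """Penalty contribution of one constraint value g (feasible when g <= 0)."""
    if g <= eps:
        return -1.0 / g                    # ordinary interior penalty
    return -(2.0 * eps - g) / eps ** 2     # linear extension beyond eps

def phi_linear(f, constraints, x, r_k, c=0.2, a=0.5):
    """phi(X, r_k) = f(X) + r_k * sum of extended penalty terms."""
    eps = transition(r_k, c, a)
    return f(x) + r_k * sum(linear_extended_term(g(x), eps) for g in constraints)

eps = transition(1.0)                              # -0.2 for r_k = 1
value_at_eps = linear_extended_term(eps, eps)      # interior branch at g = eps
value_just_past = linear_extended_term(eps + 1e-9, eps)
value_infeasible = linear_extended_term(0.1, eps)  # defined for g > 0 as well
print(eps, value_at_eps, value_just_past, value_infeasible)
```

The two branches agree at $g = \varepsilon$, and the term stays finite for infeasible points, which is what allows a standard line search to step across the constraint boundary.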


7.17.2 Quadratic Extended Penalty Function Method 

The $\phi_k$ function defined by Eq. (7.222) can be seen to be continuous with continuous first derivatives at $g_j(\mathbf{X}) = \varepsilon$. However, the second derivatives can be seen to be discontinuous at $g_j(\mathbf{X}) = \varepsilon$. Hence it is not possible to use a second-order method for unconstrained minimization [7.20]. The quadratic extended penalty function is defined so as to have continuous second derivatives at $g_j(\mathbf{X}) = \varepsilon$ as follows:

$$\phi_k = \phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \tilde{g}_j(\mathbf{X}) \tag{7.225}$$

where

$$\tilde{g}_j(\mathbf{X}) = \begin{cases} -\dfrac{1}{g_j(\mathbf{X})} & \text{if } g_j(\mathbf{X}) \le \varepsilon \\[2ex] -\dfrac{1}{\varepsilon} \left\{ \left[ \dfrac{g_j(\mathbf{X})}{\varepsilon} \right]^2 - 3\,\dfrac{g_j(\mathbf{X})}{\varepsilon} + 3 \right\} & \text{if } g_j(\mathbf{X}) > \varepsilon \end{cases} \tag{7.226}$$

With this definition, second-order methods can be used for the unconstrained minimization of $\phi_k$. It is to be noted that the degree of nonlinearity of $\phi_k$ is increased in Eq. (7.225) compared to Eq. (7.222). The concept of extended interior penalty function approach can be generalized to define a variable penalty function method from which the linear and quadratic methods can be derived as special cases [7.24].

Figure 7.14 Linear extended penalty function method.
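The continuity of the second derivative at the transition can be checked numerically. The sketch below assumes the quadratic extension $-\frac{1}{\varepsilon}\left[(g/\varepsilon)^2 - 3(g/\varepsilon) + 3\right]$ for $g > \varepsilon$ and uses a central finite difference on either side of $g = \varepsilon$; the helper names and the step sizes are illustrative choices.

```python
def quadratic_extended_term(g, eps):
    """Penalty contribution with a quadratic extension beyond g = eps (eps < 0)."""
    if g <= eps:
        return -1.0 / g
    ratio = g / eps
    return -(ratio ** 2 - 3.0 * ratio + 3.0) / eps

def second_derivative(func, g, h=1e-4):
    """Central-difference estimate of the second derivative of func at g."""
    return (func(g + h) - 2.0 * func(g) + func(g - h)) / h ** 2

eps = -0.2
term = lambda g: quadratic_extended_term(g, eps)

# Evaluate slightly inside and slightly outside the transition point so that
# each finite-difference stencil stays on a single branch.
d2_inside = second_derivative(term, eps - 1e-3)
d2_outside = second_derivative(term, eps + 1e-3)
exact_at_eps = -2.0 / eps ** 3     # second derivative of -1/g at g = eps

print(d2_inside, d2_outside, exact_at_eps)
```

Both estimates are close to $-2/\varepsilon^3 = 250$ for $\varepsilon = -0.2$, whereas a linear extension would have a zero second derivative on the outer side.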

Example 7.11 Plot the contours of the $\phi_k$ function using the linear extended interior penalty function for the following problem:

Minimize $f(x) = (x - 1)^2$

subject to

$$g_1(x) = 2 - x \le 0$$
$$g_2(x) = x - 4 \le 0$$

SOLUTION We choose $c = 0.2$ and $a = 0.5$ so that $\varepsilon = -0.2\sqrt{r_k}$. The $\phi_k$ function is defined by Eq. (7.222). By selecting the values of $r_k$ as 10.0, 1.0, 0.1, and 0.01 sequentially, we can determine the values of $\phi_k$ for different values of $x$, which can then be plotted as shown in Fig. 7.15. The graph of $f(x)$ is also shown in Fig. 7.15 for comparison.
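The values behind such a plot can be generated with a short script. This is a sketch under the stated choices $c = 0.2$ and $a = 0.5$, using the linear extended term ($-1/g$ for $g \le \varepsilon$, $-(2\varepsilon - g)/\varepsilon^2$ otherwise); the grid range and spacing are arbitrary illustrative choices, and no plotting library is used.

```python
def linear_extended_term(g, eps):
    """Linear extended penalty term for one constraint value g."""
    if g <= eps:
        return -1.0 / g
    return -(2.0 * eps - g) / eps ** 2

def phi(x, r_k, c=0.2, a=0.5):
    """phi_k for f(x) = (x - 1)^2 with g1 = 2 - x <= 0 and g2 = x - 4 <= 0."""
    eps = -c * r_k ** a
    constraints = (2.0 - x, x - 4.0)
    penalty = sum(linear_extended_term(g, eps) for g in constraints)
    return (x - 1.0) ** 2 + r_k * penalty

# Sample phi on a grid that extends beyond the feasible interval [2, 4].
grid = [1.0 + 0.001 * i for i in range(4001)]
r_values = (10.0, 1.0, 0.1, 0.01)
minimizers = [min(grid, key=lambda x, r=r: phi(x, r)) for r in r_values]

for r, x_min in zip(r_values, minimizers):
    print(f"r_k = {r:5.2f}: grid minimizer x = {x_min:.3f}, phi = {phi(x_min, r):.4f}")
```

The grid minimizers move toward the constrained optimum $x = 2$ from the feasible side as $r_k$ decreases.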


7.18 PENALTY FUNCTION METHOD FOR PROBLEMS 

WITH MIXED EQUALITY AND INEQUALITY CONSTRAINTS 

The algorithms described in previous sections cannot be directly applied to solve problems involving strict equality constraints. In this section we consider some of the methods that can be used to solve a general class of problems.

Minimize $f(\mathbf{X})$

subject to

$$\begin{aligned} g_j(\mathbf{X}) &\le 0, \qquad j = 1, 2, \ldots, m \\ l_j(\mathbf{X}) &= 0, \qquad j = 1, 2, \ldots, p \end{aligned} \tag{7.227}$$

7.18.1 Interior Penalty Function Method 

Similar to Eq. (7.154), the present problem can be converted into an unconstrained minimization problem by constructing a function of the form

$$\phi_k = \phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} G_j[g_j(\mathbf{X})] + H(r_k) \sum_{j=1}^{p} l_j^2(\mathbf{X}) \tag{7.228}$$

where $G_j$ is some function of the constraint $g_j$ tending to infinity as the constraint boundary is approached, and $H(r_k)$ is some function of the parameter $r_k$ tending to infinity as $r_k$ tends to zero. The motivation for the third term in Eq. (7.228) is that as $H(r_k) \to \infty$, the quantity $\sum_{j=1}^{p} l_j^2(\mathbf{X})$ must tend to zero. If $\sum_{j=1}^{p} l_j^2(\mathbf{X})$ does not tend to zero, $\phi_k$ would tend to infinity, and this cannot happen in a sequential minimization process if the problem has a solution. Fiacco and McCormick [7.17, 7.21] used the following form of Eq. (7.228):

$$\phi_k = \phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X})} + \frac{1}{\sqrt{r_k}} \sum_{j=1}^{p} l_j^2(\mathbf{X}) \tag{7.229}$$

If $\phi_k$ is minimized for a decreasing sequence of values $r_k$, the following theorem proves that the unconstrained minima $\mathbf{X}_k^*$ will converge to the solution $\mathbf{X}^*$ of the original problem stated in Eq. (7.227).

Theorem 7.5 If the problem posed in Eq. (7.227) has a solution, the unconstrained minima, $\mathbf{X}_k^*$, of $\phi(\mathbf{X}, r_k)$, defined by Eq. (7.229) for a sequence of values $r_1 > r_2 > \cdots > r_k$, converge to the optimal solution of the constrained problem [Eq. (7.227)] as $r_k \to 0$.

Proof: A proof similar to that of Theorem 7.1 can be given to prove this theorem. Further, the solution obtained at the end of sequential minimization of $\phi_k$ is guaranteed to be the global minimum of the problem, Eqs. (7.227), if the following conditions are satisfied:

(i) $f(\mathbf{X})$ is convex.

(ii) $g_j(\mathbf{X})$, $j = 1, 2, \ldots, m$ are convex.

(iii) $\sum_{j=1}^{p} l_j^2(\mathbf{X})$ is convex in the interior feasible domain defined by the inequality constraints.

(iv) One of the functions among $f(\mathbf{X})$, $g_1(\mathbf{X})$, $g_2(\mathbf{X})$, $\ldots$, $g_m(\mathbf{X})$, and $\sum_{j=1}^{p} l_j^2(\mathbf{X})$ is strictly convex.

Note:

1. To start the sequential unconstrained minimization process, we have to start from a point $\mathbf{X}_1$ at which the inequality constraints are satisfied and not necessarily the equality constraints.

2. Although this method has been applied to solve a variety of practical problems, it poses an extremely difficult minimization problem in many cases, mainly because of the scale disparities that arise between the penalty terms

$$r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X})} \qquad \text{and} \qquad \frac{1}{r_k^{1/2}} \sum_{j=1}^{p} l_j^2(\mathbf{X})$$

as the minimization process proceeds.

7.18.2 Exterior Penalty Function Method

To solve an optimization problem involving both equality and inequality constraints as stated in Eqs. (7.227), the following form of Eq. (7.228) has been proposed:

$$\phi_k = \phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \langle g_j(\mathbf{X}) \rangle^2 + r_k \sum_{j=1}^{p} l_j^2(\mathbf{X}) \tag{7.230}$$

where $\langle g_j(\mathbf{X}) \rangle = \max[g_j(\mathbf{X}), 0]$. As in the case of Eq. (7.199), this function has to be minimized for an increasing sequence of values of $r_k$. It can be proved that as $r_k \to \infty$, the unconstrained minima, $\mathbf{X}_k^*$, of $\phi(\mathbf{X}, r_k)$ converge to the minimum of the original constrained problem stated in Eq. (7.227).


7.19 PENALTY FUNCTION METHOD FOR PARAMETRIC 
CONSTRAINTS 

7.19.1 Parametric Constraint 

In some optimization problems, a particular constraint may have to be satisfied over a range of some parameter ($\theta$) as

$$g_j(\mathbf{X}, \theta) \le 0, \qquad \theta_l \le \theta \le \theta_u \tag{7.231}$$

where $\theta_l$ and $\theta_u$ are the lower and upper limits on $\theta$, respectively. These types of constraints are called parametric constraints. As an example, consider the design of a four-bar linkage shown in Fig. 7.16. The angular position of the output link $\phi$ will depend on the angular position of the input link, $\theta$, and the lengths of the links, $l_1$, $l_2$, $l_3$, and $l_4$. If $l_i$ ($i = 1$ to 4) are taken as the design variables $x_i$ ($i = 1$ to 4), the angular position of the output link, $\phi(\mathbf{X}, \theta)$, for any fixed value of $\theta$ ($\theta_i$) can be changed by changing the design vector, $\mathbf{X}$. Thus if $\bar{\phi}(\theta)$ is the output desired, the output $\phi(\mathbf{X}, \theta)$ generated will, in general, be different from that of $\bar{\phi}(\theta)$, as shown in Fig. 7.17. If the linkage is used in some precision equipment, we would like to restrict the difference $|\bar{\phi}(\theta) - \phi(\mathbf{X}, \theta)|$ to be smaller than some permissible value, say, $\varepsilon$. Since this restriction has to be satisfied for all values of the parameter $\theta$, the constraint can be stated as a parametric constraint as

$$|\bar{\phi}(\theta) - \phi(\mathbf{X}, \theta)| \le \varepsilon, \qquad 0^\circ \le \theta \le 360^\circ \tag{7.232}$$

Sometimes the number of parameters in a parametric constraint may be more than one. For example, consider the design of a rectangular plate acted on by an arbitrary load as shown in Fig. 7.18. If the magnitude of the stress induced under the given loading, $|\sigma(x, y)|$, is restricted to be smaller than the allowable value $\sigma_{\max}$, the constraint can be stated as a parametric constraint as

$$|\sigma(x, y)| - \sigma_{\max} \le 0, \qquad 0 \le x \le a, \quad 0 \le y \le b \tag{7.233}$$

Thus this constraint has to be satisfied at all the values of parameters $x$ and $y$.

Figure 7.16 Four-bar linkage.

Figure 7.17 Output angles generated and desired.

7.19.2 Handling Parametric Constraints 

One method of handling a parametric constraint is to replace it by a number of ordinary constraints as

$$g_j(\mathbf{X}, \theta_i) \le 0, \qquad i = 1, 2, \ldots, r \tag{7.234}$$

where $\theta_1, \theta_2, \ldots, \theta_r$ are discrete values taken in the range of $\theta$. This method is not efficient, for the following reasons:

1. It results in a very large number of constraints in the optimization problem.

2. Even if all the $r$ constraints stated in Eq. (7.234) are satisfied, the constraint may still be violated at some other value of $\theta$ [i.e., $g_j(\mathbf{X}, \theta) > 0$ where $\theta_k < \theta < \theta_{k+1}$ for some $k$].

Another method of handling the parametric constraints is to construct the $\phi$ function in a different manner as follows [7.1, 7.15].

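Interior Penalty Function Method

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \left[ \int_{\theta_l}^{\theta_u} \frac{1}{g_j(\mathbf{X}, \theta)}\, d\theta \right] \tag{7.235}$$

The idea behind using the integral in Eq. (7.235) for a parametric constraint is to make the integral tend to infinity as the value of the constraint $g_j(\mathbf{X}, \theta)$ tends to zero even at one value of $\theta$ in its range. If a gradient method is used for the unconstrained minimization of $\phi(\mathbf{X}, r_k)$, the derivatives of $\phi$ with respect to the design variables $x_i$ ($i = 1, 2, \ldots, n$) are needed. Equation (7.235) gives

$$\frac{\partial \phi}{\partial x_i}(\mathbf{X}, r_k) = \frac{\partial f}{\partial x_i}(\mathbf{X}) + r_k \sum_{j=1}^{m} \left[ \int_{\theta_l}^{\theta_u} \frac{1}{g_j^2(\mathbf{X}, \theta)}\, \frac{\partial g_j}{\partial x_i}(\mathbf{X}, \theta)\, d\theta \right] \tag{7.236}$$

by assuming that the limits of integration, $\theta_l$ and $\theta_u$, are independent of the design variables $x_i$. Thus it can be noticed that the computation of $\phi(\mathbf{X}, r_k)$ or $\partial \phi(\mathbf{X}, r_k)/\partial x_i$ involves the evaluation of an integral. In most of the practical problems, no closed-form expression will be available for $g_j(\mathbf{X}, \theta)$, and hence we have to use some sort of a numerical integration process to evaluate $\phi$ or $\partial \phi / \partial x_i$. If trapezoidal rule [7.22] is used to evaluate the integral in Eq. (7.235), we obtain†

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) - r_k \sum_{j=1}^{m} \left\{ \frac{\Delta\theta}{2} \left[ \frac{1}{g_j(\mathbf{X}, \theta_l)} + \frac{1}{g_j(\mathbf{X}, \theta_u)} \right] + \Delta\theta \sum_{p=2}^{r-1} \frac{1}{g_j(\mathbf{X}, \theta_p)} \right\} \tag{7.237}$$

where $r$ is the number of discrete values of $\theta$, and $\Delta\theta$ is the uniform spacing between the discrete values so that

$$\theta_1 = \theta_l, \quad \theta_2 = \theta_1 + \Delta\theta, \quad \theta_3 = \theta_1 + 2\,\Delta\theta, \quad \ldots, \quad \theta_r = \theta_1 + (r - 1)\,\Delta\theta = \theta_u$$

If $g_j(\mathbf{X}, \theta)$ cannot be expressed as a closed-form function of $\mathbf{X}$, the derivative $\partial g_j / \partial x_i$ occurring in Eq. (7.236) has to be evaluated by using some form of a finite-difference formula.

† Let the interval of the parameter $\theta$ be divided into $r - 1$ equal divisions so that

$$\theta_1 = \theta_l, \quad \theta_2 = \theta_1 + \Delta\theta, \quad \theta_3 = \theta_1 + 2\,\Delta\theta, \quad \ldots, \quad \theta_r = \theta_1 + (r - 1)\,\Delta\theta = \theta_u, \qquad \Delta\theta = \frac{\theta_u - \theta_l}{r - 1}$$

If the graph of the function $g_j(\mathbf{X}, \theta)$ looks as shown in Fig. 7.19, the integral of $1/g_j(\mathbf{X}, \theta)$ can be found approximately by adding the areas of all the trapeziums, like $ABCD$. This is the reason why the method is known as trapezoidal rule. The sum of all the areas is given by

$$\int_{\theta_l}^{\theta_u} \frac{d\theta}{g_j(\mathbf{X}, \theta)} \approx \sum_{p=1}^{r-1} \left[ \frac{1}{g_j(\mathbf{X}, \theta_p)} + \frac{1}{g_j(\mathbf{X}, \theta_{p+1})} \right] \frac{\Delta\theta}{2} = \frac{\Delta\theta}{2} \left[ \frac{1}{g_j(\mathbf{X}, \theta_l)} + \frac{1}{g_j(\mathbf{X}, \theta_u)} \right] + \sum_{p=2}^{r-1} \frac{\Delta\theta}{g_j(\mathbf{X}, \theta_p)}$$

Figure 7.19 Numerical integration procedure.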
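The trapezoidal evaluation of the penalty integral for a single parametric constraint can be sketched as follows. This is an illustration under simple assumptions: the design vector is held fixed, so the constraint is passed as a function of $\theta$ alone, and the test constraint $g(\theta) = -(1 + \theta)$ on $0 \le \theta \le 1$ is a made-up example chosen because its integral is known in closed form.

```python
import math

def parametric_interior_penalty(g, theta_l, theta_u, r, r_k):
    """Trapezoidal estimate of -r_k * integral of 1/g(theta) over [theta_l, theta_u].

    g is the constraint as a function of theta at a fixed design vector and
    must be strictly negative (feasible) at every sample point.
    r is the number of equally spaced sample points (r >= 2).
    """
    d_theta = (theta_u - theta_l) / (r - 1)
    thetas = [theta_l + p * d_theta for p in range(r)]
    inv = [1.0 / g(t) for t in thetas]
    # End points carry weight d_theta / 2, interior points carry weight d_theta.
    integral = 0.5 * d_theta * (inv[0] + inv[-1]) + d_theta * sum(inv[1:-1])
    return -r_k * integral

# Hypothetical constraint: the exact integral of 1/g over [0, 1] is -ln 2.
penalty = parametric_interior_penalty(lambda t: -(1.0 + t), 0.0, 1.0, r=101, r_k=1.0)
exact = math.log(2.0)
print(penalty, exact)
```

The penalty is positive for a feasible design and grows without bound if the constraint approaches zero at any sample point, although a violation between sample points can still go undetected.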

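Exterior Penalty Function Method

$$\phi(\mathbf{X}, r_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} \left[ \int_{\theta_l}^{\theta_u} \langle g_j(\mathbf{X}, \theta) \rangle^2\, d\theta \right] \tag{7.238}$$

The method of evaluating $\phi(\mathbf{X}, r_k)$ will be similar to that of the interior penalty function method.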


7.20 AUGMENTED LAGRANGE MULTIPLIER METHOD 
7.20.1 Equality-Constrained Problems 

The augmented Lagrange multiplier (ALM) method combines the Lagrange multiplier and the penalty function methods. Consider the following equality-constrained problem:

Minimize

$$f(\mathbf{X}) \tag{7.239}$$

subject to

$$h_j(\mathbf{X}) = 0, \qquad j = 1, 2, \ldots, p, \quad p < n \tag{7.240}$$

The Lagrangian corresponding to Eqs. (7.239) and (7.240) is given by

$$L(\mathbf{X}, \boldsymbol{\lambda}) = f(\mathbf{X}) + \sum_{j=1}^{p} \lambda_j h_j(\mathbf{X}) \tag{7.241}$$

where $\lambda_j$, $j = 1, 2, \ldots, p$, are the Lagrange multipliers. The necessary conditions for a stationary point of $L(\mathbf{X}, \boldsymbol{\lambda})$ include the equality constraints, Eq. (7.240). The exterior penalty function approach is used to define the new objective function $A(\mathbf{X}, \boldsymbol{\lambda}, r_k)$, termed the augmented Lagrangian function, as

$$A(\mathbf{X}, \boldsymbol{\lambda}, r_k) = f(\mathbf{X}) + \sum_{j=1}^{p} \lambda_j h_j(\mathbf{X}) + r_k \sum_{j=1}^{p} h_j^2(\mathbf{X}) \tag{7.242}$$

where $r_k$ is the penalty parameter. It can be noted that the function $A$ reduces to the Lagrangian if $r_k = 0$ and to the $\phi$ function used in the classical penalty function method if all $\lambda_j = 0$. It can be shown that if the Lagrange multipliers are fixed at their optimum values $\lambda_j^*$, the minimization of $A(\mathbf{X}, \boldsymbol{\lambda}, r_k)$ gives the solution of the problem stated in Eqs. (7.239) and (7.240) in one step for any value of $r_k$. In such a case there is no need to minimize the function $A$ for an increasing sequence of values of $r_k$. Since the values of $\lambda_j^*$ are not known in advance, an iterative scheme is used to find the solution of the problem. In the first iteration ($k = 1$), the values of $\lambda_j^{(k)}$ are chosen as zero, the value of $r_k$ is set equal to an arbitrary constant, and the function $A$ is minimized with respect to $\mathbf{X}$ to find $\mathbf{X}^{*(k)}$. The values of $\lambda_j^{(k)}$ and $r_k$ are then updated to start the next iteration. For this, the necessary conditions for the stationary point of $L$, given by Eq. (7.241), are written as

$$\frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} + \sum_{j=1}^{p} \lambda_j^* \frac{\partial h_j}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n \tag{7.243}$$

where $\lambda_j^*$ denote the values of Lagrange multipliers at the stationary point of $L$. Similarly, the necessary conditions for the minimum of $A$ can be expressed as

$$\frac{\partial A}{\partial x_i} = \frac{\partial f}{\partial x_i} + \sum_{j=1}^{p} (\lambda_j + 2 r_k h_j) \frac{\partial h_j}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n \tag{7.244}$$

A comparison of the right-hand sides of Eqs. (7.243) and (7.244) yields

$$\lambda_j^* = \lambda_j + 2 r_k h_j, \qquad j = 1, 2, \ldots, p \tag{7.245}$$

These equations are used to update the values of $\lambda_j$ as

$$\lambda_j^{(k+1)} = \lambda_j^{(k)} + 2 r_k h_j(\mathbf{X}^{(k)}), \qquad j = 1, 2, \ldots, p \tag{7.246}$$

where $\mathbf{X}^{(k)}$ denotes the starting vector used in the minimization of $A$. The value of $r_k$ is updated as

$$r_{k+1} = c\, r_k, \qquad c > 1 \tag{7.247}$$

The function $A$ is then minimized with respect to $\mathbf{X}$ to find $\mathbf{X}^{*(k+1)}$ and the iterative process is continued until convergence is achieved for $\lambda_j^{(k)}$ or $\mathbf{X}^*$. If the value of $r_{k+1}$ exceeds a prespecified maximum value $r_{\max}$, it is set equal to $r_{\max}$. The iterative process is indicated as a flow diagram in Fig. 7.20.



Figure 7.20 Flowchart of augmented Lagrange multiplier method. 




7.20.2 Inequality-Constrained Problems 

Consider the following inequality-constrained problem:

Minimize

$$f(\mathbf{X}) \tag{7.248}$$

subject to

$$g_j(\mathbf{X}) \le 0, \qquad j = 1, 2, \ldots, m \tag{7.249}$$

To apply the ALM method, the inequality constraints of Eq. (7.249) are first converted to equality constraints as

$$g_j(\mathbf{X}) + y_j^2 = 0, \qquad j = 1, 2, \ldots, m \tag{7.250}$$

where $y_j^2$ are the slack variables. Then the augmented Lagrangian function is constructed as

$$A(\mathbf{X}, \boldsymbol{\lambda}, \mathbf{Y}, r_k) = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j [g_j(\mathbf{X}) + y_j^2] + r_k \sum_{j=1}^{m} [g_j(\mathbf{X}) + y_j^2]^2 \tag{7.251}$$

where the vector of slack variables, $\mathbf{Y}$, is given by

$$\mathbf{Y} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{Bmatrix}$$

If the slack variables $y_j$, $j = 1, 2, \ldots, m$, are considered as additional unknowns, the function $A$ is to be minimized with respect to $\mathbf{X}$ and $\mathbf{Y}$ for specified values of $\lambda_j$ and $r_k$. This increases the problem size. It can be shown [7.23] that the function $A$ given by Eq. (7.251) is equivalent to

$$A(\mathbf{X}, \boldsymbol{\lambda}, r_k) = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j \alpha_j + r_k \sum_{j=1}^{m} \alpha_j^2 \tag{7.252}$$

where

$$\alpha_j = \max \left\{ g_j(\mathbf{X}),\; -\frac{\lambda_j}{2 r_k} \right\} \tag{7.253}$$

Thus the solution of the problem stated in Eqs. (7.248) and (7.249) can be obtained by minimizing the function $A$, given by Eq. (7.252), as in the case of equality-constrained problems using the update formula

$$\lambda_j^{(k+1)} = \lambda_j^{(k)} + 2 r_k \alpha_j^{(k)}, \qquad j = 1, 2, \ldots, m \tag{7.254}$$

in place of Eq. (7.246). It is to be noted that the function $A$, given by Eq. (7.252), is continuous and has continuous first derivatives but has discontinuous second derivatives with respect to $\mathbf{X}$ at $g_j(\mathbf{X}) = -\lambda_j / 2 r_k$. Hence a second-order method cannot be used to minimize the function $A$.
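The multiplier update for an inequality constraint can be illustrated on a one-variable problem. The sketch below is not taken from the text: it assumes the made-up problem of minimizing $f(x) = x^2$ subject to $g(x) = 1 - x \le 0$ (solution $x^* = 1$, $\lambda^* = 2$), forms $A = f + \lambda\alpha + r_k\alpha^2$ with $\alpha = \max\{g, -\lambda/(2 r_k)\}$, and uses a golden-section search as a derivative-free stand-in for the inner minimization, since only first derivatives of $A$ are continuous.

```python
import math

def golden_section(func, lo, hi, tol=1e-10):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def augmented(x, lam, r_k):
    """Return (A, alpha) for f = x^2 and g = 1 - x <= 0."""
    g = 1.0 - x
    alpha = max(g, -lam / (2.0 * r_k))
    return x * x + lam * alpha + r_k * alpha * alpha, alpha

lam, r_k = 0.0, 1.0            # initial multiplier and a fixed penalty parameter
for k in range(40):
    x = golden_section(lambda t: augmented(t, lam, r_k)[0], -5.0, 5.0)
    alpha = augmented(x, lam, r_k)[1]
    lam += 2.0 * r_k * alpha   # multiplier update with alpha in place of h

print(x, lam)
```

With $r_k$ held at 1, the multiplier error halves at each iteration in this example, so convergence is obtained without driving the penalty parameter to infinity.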
7.20.3 Mixed Equality-Inequality-Constrained Problems

Consider the following general optimization problem:

Minimize

$$f(\mathbf{X}) \tag{7.255}$$

subject to

$$g_j(\mathbf{X}) \le 0, \qquad j = 1, 2, \ldots, m \tag{7.256}$$

$$h_j(\mathbf{X}) = 0, \qquad j = 1, 2, \ldots, p \tag{7.257}$$

This problem can be solved by combining the procedures of the two preceding sections. The augmented Lagrangian function, in this case, is defined as

$$A(\mathbf{X}, \boldsymbol{\lambda}, r_k) = f(\mathbf{X}) + \sum_{j=1}^{m} \lambda_j \alpha_j + \sum_{j=1}^{p} \lambda_{m+j} h_j(\mathbf{X}) + r_k \sum_{j=1}^{m} \alpha_j^2 + r_k \sum_{j=1}^{p} h_j^2(\mathbf{X}) \tag{7.258}$$

where $\alpha_j$ is given by Eq. (7.253). The solution of the problem stated in Eqs. (7.255) to (7.257) can be found by minimizing the function $A$, defined by Eq. (7.258), as in the case of equality-constrained problems using the update formula

$$\lambda_j^{(k+1)} = \lambda_j^{(k)} + 2 r_k \max \left\{ g_j(\mathbf{X}),\; -\frac{\lambda_j^{(k)}}{2 r_k} \right\}, \qquad j = 1, 2, \ldots, m \tag{7.259}$$

$$\lambda_{m+j}^{(k+1)} = \lambda_{m+j}^{(k)} + 2 r_k h_j(\mathbf{X}), \qquad j = 1, 2, \ldots, p \tag{7.260}$$

The ALM method has several advantages. As stated earlier, the value of $r_k$ need not be increased to infinity for convergence. The starting design vector, $\mathbf{X}^{(1)}$, need not be feasible. Finally, it is possible to achieve $g_j(\mathbf{X}) = 0$ and $h_j(\mathbf{X}) = 0$ precisely and the nonzero values of the Lagrange multipliers ($\lambda_j \ne 0$) identify the active constraints automatically.

Example 7.12

Minimize

$$f(\mathbf{X}) = 6x_1^2 + 4x_1 x_2 + 3x_2^2 \tag{E1}$$

subject to

$$h(\mathbf{X}) = x_1 + x_2 - 5 = 0 \tag{E2}$$

using the ALM method.

SOLUTION The augmented Lagrangian function can be constructed as

$$A(\mathbf{X}, \lambda, r_k) = 6x_1^2 + 4x_1 x_2 + 3x_2^2 + \lambda (x_1 + x_2 - 5) + r_k (x_1 + x_2 - 5)^2 \tag{E3}$$

For the stationary point of $A$, the necessary conditions, $\partial A / \partial x_i = 0$, $i = 1, 2$, yield

$$x_1 (12 + 2 r_k) + x_2 (4 + 2 r_k) = 10 r_k - \lambda \tag{E4}$$

$$x_1 (4 + 2 r_k) + x_2 (6 + 2 r_k) = 10 r_k - \lambda \tag{E5}$$

The solution of Eqs. (E4) and (E5) gives

$$x_1 = \frac{-90 r_k^2 + 9 r_k \lambda - 6\lambda + 60 r_k}{(14 - 5 r_k)(12 + 2 r_k)} \tag{E6}$$

$$x_2 = \frac{20 r_k - 2\lambda}{14 - 5 r_k} \tag{E7}$$

Let the value of $r_k$ be fixed at 1 and select a value of $\lambda^{(1)} = 0$. This gives

$$x_1^{*(1)} = -\frac{5}{21}, \qquad x_2^{*(1)} = \frac{20}{9} \qquad \text{with } h = -\frac{5}{21} + \frac{20}{9} - 5 = -3.01587$$

For the next iteration,

$$\lambda^{(2)} = \lambda^{(1)} + 2 r_k h(\mathbf{X}^{*(1)}) = 0 + 2(1)(-3.01587) = -6.03175$$

Substituting this value for $\lambda$ along with $r_k = 1$ in Eqs. (E6) and (E7), we get

$$x_1^{*(2)} = -0.38171, \qquad x_2^{*(2)} = 3.56261 \qquad \text{with } h = -0.38171 + 3.56261 - 5 = -1.81910$$

This procedure can be continued until some specified convergence is satisfied. The results of the first ten iterations are given in Table 7.6.

Table 7.6 Results for Example 7.12

    lambda^(i)     r_k        x_1^*(i)    x_2^*(i)    Value of h
      0.00000      1.00000    -0.23810    2.22222     -3.01587
     -6.03175      1.00000    -0.38171    3.56261     -1.81910
     -9.66994      1.00000    -0.46833    4.37110     -1.09723
    -11.86441      1.00000    -0.52058    4.85876     -0.66182
    -13.18806      1.00000    -0.55210    5.15290     -0.39919
    -13.98645      1.00000    -0.57111    5.33032     -0.24078
    -14.46801      1.00000    -0.58257    5.43734     -0.14524
    -14.75848      1.00000    -0.58949    5.50189     -0.08760
    -14.93369      1.00000    -0.59366    5.54082     -0.05284
    -15.03937      1.00000    -0.59618    5.56430     -0.03187
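The iteration can be reproduced with a few lines of code. The sketch below solves the two stationarity conditions $x_1(12 + 2r_k) + x_2(4 + 2r_k) = 10r_k - \lambda$ and $x_1(4 + 2r_k) + x_2(6 + 2r_k) = 10r_k - \lambda$ directly as a $2 \times 2$ linear system by Cramer's rule and then applies the update $\lambda \leftarrow \lambda + 2 r_k h$; the function names are illustrative. Subtracting the second condition from the first gives $8x_1 - 2x_2 = 0$, so a simultaneous solution satisfies $x_2 = 4x_1$, and with this direct solve the iterates appear to approach $x_1 = 1$, $x_2 = 4$, $\lambda = -28$ (where $f = 70$). These values differ from the printed closed-form expressions and tabulated entries of the example, which seem to satisfy only the first condition, so the printed numbers may be worth an independent check.

```python
def minimize_A(lam, r):
    """Solve the 2x2 stationarity system of the augmented Lagrangian."""
    a11, a12 = 12.0 + 2.0 * r, 4.0 + 2.0 * r
    a21, a22 = 4.0 + 2.0 * r, 6.0 + 2.0 * r
    rhs = 10.0 * r - lam                 # same right-hand side in both rows
    det = a11 * a22 - a12 * a21
    x1 = rhs * (a22 - a12) / det         # Cramer's rule
    x2 = rhs * (a11 - a21) / det
    return x1, x2

def alm(r=1.0, lam=0.0, iterations=100):
    """Run ALM iterations with a fixed penalty parameter r."""
    history = []
    for _ in range(iterations):
        x1, x2 = minimize_A(lam, r)
        h = x1 + x2 - 5.0                # constraint value at the minimizer
        history.append((lam, x1, x2, h))
        lam += 2.0 * r * h               # multiplier update
    return history

history = alm()
lam_final, x1_final, x2_final, h_final = history[-1]
print(lam_final, x1_final, x2_final, h_final)
```

With $r_k = 1$ the multiplier error shrinks by a factor of $14/19$ per iteration in this sketch, which is why on the order of a hundred iterations are used to reach tight tolerances.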


7.21 CHECKING THE CONVERGENCE OF CONSTRAINED 
OPTIMIZATION PROBLEMS 

In all the constrained optimization techniques described in this chapter, identification of the optimum solution is very important from the points of view of stopping the iterative process and using the solution with confidence. In addition to the convergence criteria discussed earlier, the following two methods can also be used to test the point for optimality.


7.21.1 Perturbing the Design Vector 

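Since the optimum point

$$\mathbf{X}^* = \begin{Bmatrix} x_1^* \\ x_2^* \\ \vdots \\ x_n^* \end{Bmatrix}$$

corresponds to the minimum function value subject to the satisfaction of the constraints $g_j(\mathbf{X}^*) \le 0$, $j = 1, 2, \ldots, m$ (the equality constraints can also be included, if necessary), we perturb $\mathbf{X}^*$ by changing each of the design variables, one at a time, by a small amount, and evaluate the values of $f$ and $g_j$, $j = 1, 2, \ldots, m$. Thus if

$$\mathbf{X}_i^+ = \mathbf{X}^* + \Delta\mathbf{X}_i, \qquad \mathbf{X}_i^- = \mathbf{X}^* - \Delta\mathbf{X}_i$$

where

$$\Delta\mathbf{X}_i = \begin{Bmatrix} 0 \\ \vdots \\ 0 \\ \Delta x_i \\ 0 \\ \vdots \\ 0 \end{Bmatrix} \leftarrow i\text{th row}$$

$\Delta x_i$ is a small perturbation in $x_i$ that can be taken as 0.1 to 2.0% of $x_i^*$. Evaluate

$$f(\mathbf{X}_i^+);\quad f(\mathbf{X}_i^-);\quad g_j(\mathbf{X}_i^+);\quad g_j(\mathbf{X}_i^-), \qquad j = 1, 2, \ldots, m \quad \text{for } i = 1, 2, \ldots, n$$

If

$$f(\mathbf{X}_i^+) \ge f(\mathbf{X}^*); \qquad g_j(\mathbf{X}_i^+) \le 0, \quad j = 1, 2, \ldots, m$$

$$f(\mathbf{X}_i^-) \ge f(\mathbf{X}^*); \qquad g_j(\mathbf{X}_i^-) \le 0, \quad j = 1, 2, \ldots, m$$

for $i = 1, 2, \ldots, n$, $\mathbf{X}^*$ can be taken as the constrained optimum point of the original problem.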
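A perturbation check of this kind can be sketched as below. The sketch adopts a slightly relaxed reading of the test, which is an assumption rather than the stated condition: a perturbed point counts as evidence against optimality only if it is feasible and has a lower objective value, so that perturbations that leave the feasible region at an active constraint do not by themselves fail the check. The function name, the 1% default perturbation, the feasibility tolerance, and the test problem are all illustrative.

```python
def perturbation_check(f, constraints, x_star, fraction=0.01, tol=1e-9):
    """Return perturbed points that are feasible yet improve on f(x_star).

    Each variable is changed one at a time by +/- fraction * |x_i|
    (or by +/- fraction itself when x_i is zero).
    An empty list means no single-variable perturbation contradicts optimality.
    """
    f_star = f(x_star)
    counterexamples = []
    for i, xi in enumerate(x_star):
        dx = fraction * abs(xi) if xi != 0.0 else fraction
        for sign in (+1.0, -1.0):
            trial = list(x_star)
            trial[i] = xi + sign * dx
            feasible = all(g(trial) <= tol for g in constraints)
            if feasible and f(trial) < f_star - tol:
                counterexamples.append(trial)
    return counterexamples

# Hypothetical problem: minimize (x1 - 1)^2 + (x2 - 2)^2 subject to x1 + x2 - 2 <= 0.
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
g = [lambda x: x[0] + x[1] - 2.0]

at_optimum = perturbation_check(f, g, [0.5, 1.5])      # constrained optimum
at_other_point = perturbation_check(f, g, [0.2, 0.2])  # feasible, not optimal
print(len(at_optimum), len(at_other_point))
```

Because only coordinate directions are probed, passing the check is supporting evidence rather than proof; a descent direction that is feasible only along a combination of variables would go unnoticed.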


7.21.2 Testing the Kuhn-Tucker Conditions 

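Since the Kuhn-Tucker conditions, Eqs. (2.73) and (2.74), are necessarily to be satisfied† by the optimum point of any nonlinear programming problem, we can at least test for the satisfaction of these conditions before taking a point $\mathbf{X}$ as optimum. Equations (2.73) can be written as

$$\sum_{j \in J_1} \lambda_j \frac{\partial g_j}{\partial x_i} = -\frac{\partial f}{\partial x_i}, \qquad i = 1, 2, \ldots, n \tag{7.261}$$

where $J_1$ indicates the set of active constraints at the point $\mathbf{X}$. If $g_{j1}(\mathbf{X}) = g_{j2}(\mathbf{X}) = \cdots = g_{jp}(\mathbf{X}) = 0$, Eqs. (7.261) can be expressed as

$$\underset{n \times p}{\mathbf{G}}\; \underset{p \times 1}{\boldsymbol{\lambda}} = \underset{n \times 1}{\mathbf{F}} \tag{7.262}$$

where

$$\mathbf{G} = \begin{bmatrix} \dfrac{\partial g_{j1}}{\partial x_1} & \dfrac{\partial g_{j2}}{\partial x_1} & \cdots & \dfrac{\partial g_{jp}}{\partial x_1} \\[2ex] \dfrac{\partial g_{j1}}{\partial x_2} & \dfrac{\partial g_{j2}}{\partial x_2} & \cdots & \dfrac{\partial g_{jp}}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial g_{j1}}{\partial x_n} & \dfrac{\partial g_{j2}}{\partial x_n} & \cdots & \dfrac{\partial g_{jp}}{\partial x_n} \end{bmatrix}_{\mathbf{X}}, \qquad \boldsymbol{\lambda} = \begin{Bmatrix} \lambda_{j1} \\ \lambda_{j2} \\ \vdots \\ \lambda_{jp} \end{Bmatrix}, \qquad \mathbf{F} = \begin{Bmatrix} -\dfrac{\partial f}{\partial x_1} \\[2ex] -\dfrac{\partial f}{\partial x_2} \\ \vdots \\ -\dfrac{\partial f}{\partial x_n} \end{Bmatrix}_{\mathbf{X}}$$

From Eqs. (7.262) we can obtain an expression for $\boldsymbol{\lambda}$ as

$$\boldsymbol{\lambda} = (\mathbf{G}^{T} \mathbf{G})^{-1} \mathbf{G}^{T} \mathbf{F} \tag{7.263}$$

If all the components of $\boldsymbol{\lambda}$, given by Eq. (7.263), are positive, the Kuhn-Tucker conditions will be satisfied. A major difficulty in applying Eq. (7.263) arises from the fact that it is very difficult to ascertain which constraints are active at the point $\mathbf{X}$. Since no constraint will have exactly the value of 0.0 at the point $\mathbf{X}$ while working on the computer, we have to take a constraint $g_j$ to be active whenever it satisfies the relation

$$|g_j(\mathbf{X})| \le \varepsilon \tag{7.264}$$

where $\varepsilon$ is a small number on the order of $10^{-2}$ to $10^{-6}$. Notice that Eq. (7.264) assumes that the constraints were originally normalized.

† These may not be sufficient to guarantee a global minimum point for nonconvex programming problems.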
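The multiplier estimate $\boldsymbol{\lambda} = (\mathbf{G}^T\mathbf{G})^{-1}\mathbf{G}^T\mathbf{F}$ with $\mathbf{F} = -\nabla f$ can be sketched without external libraries by forming and solving the normal equations. The helper names and the two-constraint test problem below are assumptions made for the illustration, and the gradients are supplied by hand.

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def kkt_multipliers(grad_f, active_grads):
    """Least-squares multipliers; active_grads[j] is the gradient of active g_j."""
    p = len(active_grads)
    gtg = [[sum(u * v for u, v in zip(active_grads[i], active_grads[j]))
            for j in range(p)] for i in range(p)]
    gtf = [sum(-df * dg for df, dg in zip(grad_f, active_grads[i]))
           for i in range(p)]
    return solve_linear(gtg, gtf)

# Hypothetical problem: minimize (x1 - 2)^2 + (x2 - 2)^2
# subject to g1 = x1 - 1 <= 0 and g2 = x2 - 1 <= 0, tested at X = (1, 1).
grad_f = [2.0 * (1.0 - 2.0), 2.0 * (1.0 - 2.0)]    # (-2, -2)
active_grads = [[1.0, 0.0], [0.0, 1.0]]            # gradients of g1 and g2
lam = kkt_multipliers(grad_f, active_grads)

# Residual of G * lambda - F, which should vanish at a Kuhn-Tucker point.
residual = [sum(active_grads[j][i] * lam[j] for j in range(len(lam))) + grad_f[i]
            for i in range(len(grad_f))]
print(lam, residual)
```

Since the formula is a least-squares solution, it returns multipliers even when the system has no exact solution; checking that the residual is small alongside the sign of the multipliers is therefore a sensible extra step.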
7.22 TEST PROBLEMS


As discussed in previous sections, a number of algorithms are available for solving 
a constrained nonlinear programming problem. In recent years, a variety of computer 
programs have been developed to solve engineering optimization problems. Many of 
these are complex and versatile and the user needs a good understanding of the algo- 
rithms/computer programs to be able to use them effectively. Before solving a new 
engineering design optimization problem, we usually test the behavior and conver- 
gence of the algorithm/computer program on simple test problems. Five test problems 
are given in this section. All these problems have appeared in the optimization literature 
and most of them have been solved using different techniques. 


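7.22.1 Design of a Three-Bar Truss

The optimal design of the three-bar truss shown in Fig. 7.21 is considered using two different objectives with the cross-sectional areas of members 1 (and 3) and 2 as design variables [7.38].

Design vector:

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}$$

where $x_1$ is the cross-sectional area of members 1 and 3 and $x_2$ is the cross-sectional area of member 2.

Objective functions:

$$f_1(\mathbf{X}) = \text{weight} = 2\sqrt{2}\, x_1 + x_2$$

$$f_2(\mathbf{X}) = \text{vertical deflection of loaded joint} = \frac{PH}{E}\, \frac{1}{x_1 + \sqrt{2}\, x_2}$$

Constraints:

$$\sigma_1(\mathbf{X}) - \sigma^{(u)} \le 0$$
$$\sigma_2(\mathbf{X}) - \sigma^{(u)} \le 0$$
$$\sigma^{(l)} - \sigma_3(\mathbf{X}) \le 0$$
$$x_i^{(l)} \le x_i \le x_i^{(u)}, \qquad i = 1, 2$$

where $\sigma_i$ is the stress induced in member $i$, $\sigma^{(u)}$ the maximum permissible stress in tension, $\sigma^{(l)}$ the maximum permissible stress in compression, $x_i^{(l)}$ the lower bound on $x_i$, and $x_i^{(u)}$ the upper bound on $x_i$. The stresses are given by

$$\sigma_1(\mathbf{X}) = P\, \frac{x_2 + \sqrt{2}\, x_1}{\sqrt{2}\, x_1^2 + 2 x_1 x_2}$$

$$\sigma_2(\mathbf{X}) = P\, \frac{1}{x_1 + \sqrt{2}\, x_2}$$

$$\sigma_3(\mathbf{X}) = -P\, \frac{x_2}{\sqrt{2}\, x_1^2 + 2 x_1 x_2}$$

Figure 7.21 Three-bar truss [7.38].

Data: $\sigma^{(u)} = 20$, $\sigma^{(l)} = -15$, $x_i^{(l)} = 0.1$ ($i = 1, 2$), $x_i^{(u)} = 5.0$ ($i = 1, 2$), $P = 20$, and $E = 1$.

Optimum design:

$$\mathbf{X}_1^* = \begin{Bmatrix} 0.78706 \\ 0.40735 \end{Bmatrix}, \qquad f_1^* = 2.6335, \quad \text{stress constraint of member 1 is active at } \mathbf{X}_1^*$$

$$f_2^* = 1.6569$$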

7.22.2 Design of a Twenty-Five-Bar Space Truss 

The 25-bar space truss shown in Fig. 7.22 is required to support the two load conditions given in Table 7.7 and is to be designed with constraints on member stresses as well as Euler buckling [7.38]. A minimum allowable area is specified for each member. The allowable stresses for all members are specified as $\sigma_{\max}$ in both tension and compression. The Young's modulus and the material density are taken as $E = 10^7$ psi and $\rho = 0.1$ lb/in$^3$. The members are assumed to be tubular with a nominal diameter/thickness ratio of 100, so that the buckling stress in member $i$ becomes

$$p_i = -\frac{100.01\, \pi E A_i}{8\, l_i^2}, \qquad i = 1, 2, \ldots, 25$$

where $A_i$ and $l_i$ denote the cross-sectional area and length, respectively, of member $i$. The member areas are linked as follows:

$$A_1, \quad A_2 = A_3 = A_4 = A_5, \quad A_6 = A_7 = A_8 = A_9,$$
$$A_{10} = A_{11}, \quad A_{12} = A_{13}, \quad A_{14} = A_{15} = A_{16} = A_{17},$$
$$A_{18} = A_{19} = A_{20} = A_{21}, \quad A_{22} = A_{23} = A_{24} = A_{25}$$

Thus there are eight independent area design variables in the problem. Three problems are solved using different objective functions:

$$f_1(\mathbf{X}) = \sum_{i=1}^{25} \rho A_i l_i = \text{weight}$$

$$f_2(\mathbf{X}) = (\delta_{1x}^2 + \delta_{1y}^2 + \delta_{1z}^2)^{1/2} + (\delta_{2x}^2 + \delta_{2y}^2 + \delta_{2z}^2)^{1/2} = \text{sum of deflections of nodes 1 and 2}$$

$$f_3(\mathbf{X}) = -\omega_1 = \text{negative of fundamental natural frequency of vibration}$$

where $\delta_{ix}$ = deflection of node $i$ along $x$ direction.

[Figure 7.22: 25-bar space truss.]

Table 7.7 Loads Acting on the 25-Bar Truss

                              Joint
               1          2          3        6
    Load condition 1, loads in pounds
    F_x        0          0          0        0
    F_y   20,000    -20,000          0        0
    F_z   -5,000     -5,000          0        0
    Load condition 2, loads in pounds
    F_x    1,000          0        500      500
    F_y   10,000     10,000          0        0
    F_z   -5,000     -5,000          0        0
Constraints:

$$|\sigma_{ij}(\mathbf{X})| \le \sigma_{\max}, \qquad i = 1, 2, \ldots, 25, \quad j = 1, 2$$
$$\sigma_{ij}(\mathbf{X}) \le p_i(\mathbf{X}), \qquad i = 1, 2, \ldots, 25, \quad j = 1, 2$$
$$x_i^{(l)} \le x_i \le x_i^{(u)}, \qquad i = 1, 2, \ldots, 8$$

where $\sigma_{ij}$ is the stress induced in member $i$ under load condition $j$, $x_i^{(l)}$ the lower bound on $x_i$, and $x_i^{(u)}$ the upper bound on $x_i$.

Data: $\sigma_{\max} = 40{,}000$ psi, $x_i^{(l)} = 0.1$ in$^2$, $x_i^{(u)} = 5.0$ in$^2$ for $i = 1, 2, \ldots, 25$.

Optimum solution: See Table 7.8.

Table 7.8 Optimization Results of the 25-Bar Truss [7.38]

                                                  Optimization problem
    Quantity                         Minimization   Minimization    Maximization
                                     of weight      of deflection   of frequency
    Design vector, X                 0.1 (a)        3.7931          0.1 (a)
                                     0.80228        5.0 (a)         0.79769
                                     0.74789        5.0 (a)         0.74605
                                     0.1 (a)        3.3183          0.72817
                                     0.12452        5.0 (a)         0.84836
                                     0.57117        5.0 (a)         1.9944
                                     0.97851        5.0 (a)         1.9176
                                     0.80247        5.0 (a)         4.1119
    Weight (lb)                      233.07265      1619.3258       600.87891
    Deflection (in.)                 1.924989       0.30834         1.35503
    Fundamental frequency (Hz)       73.25348       70.2082         108.6224
    Number of active behavior
      constraints                    9 (b)          0               4 (c)

    (a) Active side constraint.
    (b) Buckling stress in members 2, 5, 7, 8, 19, and 20 in load condition 1 and in members 13, 16, and 24 in load condition 2.
    (c) Buckling stress in members 2, 5, 7, and 8 in load condition 1.

7.22.3 Welded Beam Design

The welded beam shown in Fig. 7.23 is designed for minimum cost subject to constraints on shear stress in weld ($\tau$), bending stress in the beam ($\sigma$), buckling load on the bar ($P_c$), end deflection of the beam ($\delta$), and side constraints [7.39].

Design vector:

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{Bmatrix} = \begin{Bmatrix} h \\ l \\ t \\ b \end{Bmatrix}$$

Objective function: $f(\mathbf{X}) = 1.10471\, x_1^2 x_2 + 0.04811\, x_3 x_4 (14.0 + x_2)$

Constraints:

$$g_1(\mathbf{X}) = \tau(\mathbf{X}) - \tau_{\max} \le 0$$
$$g_2(\mathbf{X}) = \sigma(\mathbf{X}) - \sigma_{\max} \le 0$$
$$g_3(\mathbf{X}) = x_1 - x_4 \le 0$$
$$g_4(\mathbf{X}) = 0.10471\, x_1^2 + 0.04811\, x_3 x_4 (14.0 + x_2) - 5.0 \le 0$$
$$g_5(\mathbf{X}) = 0.125 - x_1 \le 0$$
$$g_6(\mathbf{X}) = \delta(\mathbf{X}) - \delta_{\max} \le 0$$
$$g_7(\mathbf{X}) = P - P_c(\mathbf{X}) \le 0$$
$$g_8(\mathbf{X}) \text{ to } g_{11}(\mathbf{X}): \quad 0.1 \le x_i \le 2.0, \qquad i = 1, 4$$
$$g_{12}(\mathbf{X}) \text{ to } g_{15}(\mathbf{X}): \quad 0.1 \le x_i \le 10.0, \qquad i = 2, 3$$

where

$$\tau(\mathbf{X}) = \sqrt{(\tau')^2 + 2 \tau' \tau'' \frac{x_2}{2R} + (\tau'')^2}$$

$$\tau' = \frac{P}{\sqrt{2}\, x_1 x_2}, \qquad \tau'' = \frac{MR}{J}, \qquad M = P \left( L + \frac{x_2}{2} \right)$$

$$R = \sqrt{\frac{x_2^2}{4} + \left( \frac{x_1 + x_3}{2} \right)^2}$$

$$J = 2 \left\{ \frac{x_1 x_2}{\sqrt{2}} \left[ \frac{x_2^2}{12} + \left( \frac{x_1 + x_3}{2} \right)^2 \right] \right\}$$

$$\sigma(\mathbf{X}) = \frac{6PL}{x_4 x_3^2}$$

$$\delta(\mathbf{X}) = \frac{4PL^3}{E x_3^3 x_4}$$

$$P_c(\mathbf{X}) = \frac{4.013 \sqrt{EG\,(x_3^2 x_4^6 / 36)}}{L^2} \left( 1 - \frac{x_3}{2L} \sqrt{\frac{E}{4G}} \right)$$

Data: $P = 6000$ lb, $L = 14$ in., $E = 30 \times 10^6$ psi, $G = 12 \times 10^6$ psi, $\tau_{\max} = 13{,}600$ psi, $\sigma_{\max} = 30{,}000$ psi, and $\delta_{\max} = 0.25$ in.

Starting and optimum solutions:

$$\mathbf{X}_{\text{start}} = \begin{Bmatrix} h \\ l \\ t \\ b \end{Bmatrix} = \begin{Bmatrix} 0.4 \\ 6.0 \\ 9.0 \\ 0.5 \end{Bmatrix} \text{ in.}, \qquad f_{\text{start}} = \$5.3904$$

$$\mathbf{X}^* = \begin{Bmatrix} h \\ l \\ t \\ b \end{Bmatrix}^* = \begin{Bmatrix} 0.2444 \\ 6.2177 \\ 8.2915 \\ 0.2444 \end{Bmatrix} \text{ in.}, \qquad f^* = \$2.3810$$

7.22.4 Speed Reducer (Gear Train) Design 

The design of the speed reducer, shown in Fig. 7.24, is considered with the face width ($b$), module of teeth ($m$), number of teeth on pinion ($z$), length of shaft 1 between bearings ($l_1$), length of shaft 2 between bearings ($l_2$), diameter of shaft 1 ($d_1$), and diameter of shaft 2 ($d_2$) as design variables $x_1, x_2, \ldots, x_7$, respectively. The constraints include limitations on the bending stress of gear teeth, surface stress, transverse deflections of shafts 1 and 2 due to transmitted force, and stresses in shafts 1 and 2 [7.40, 7.41].



Figure 7.24 Speed reducer (gear pair) [7.40]. 
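Objective (minimization of weight of speed reducer):

$$f(\mathbf{X}) = 0.7854\, x_1 x_2^2 (3.3333\, x_3^2 + 14.9334\, x_3 - 43.0934) - 1.508\, x_1 (x_6^2 + x_7^2) + 7.477\, (x_6^3 + x_7^3) + 0.7854\, (x_4 x_6^2 + x_5 x_7^2)$$

Constraints:

$$g_1(\mathbf{x}) = 27\, x_1^{-1} x_2^{-2} x_3^{-1} \le 1$$
$$g_2(\mathbf{x}) = 397.5\, x_1^{-1} x_2^{-2} x_3^{-2} \le 1$$
$$g_3(\mathbf{x}) = 1.93\, x_2^{-1} x_3^{-1} x_4^{3} x_6^{-4} \le 1$$
$$g_4(\mathbf{x}) = 1.93\, x_2^{-1} x_3^{-1} x_5^{3} x_7^{-4} \le 1$$

$$g_5(\mathbf{x}) = \left[ \left( \frac{745\, x_4}{x_2 x_3} \right)^2 + (16.9)10^6 \right]^{0.5} \Big/ \; 0.1\, x_6^3 \le 1100$$

$$g_6(\mathbf{x}) = \left[ \left( \frac{745\, x_5}{x_2 x_3} \right)^2 + (157.5)10^6 \right]^{0.5} \Big/ \; 0.1\, x_7^3 \le 850$$

$$g_7(\mathbf{x}) = x_2 x_3 \le 40$$

$$g_8(\mathbf{x}): \; 5 \le \frac{x_1}{x_2} \le 12 \; : g_9(\mathbf{x})$$
$$g_{10}(\mathbf{x}): \; 2.6 \le x_1 \le 3.6 \; : g_{11}(\mathbf{x})$$
$$g_{12}(\mathbf{x}): \; 0.7 \le x_2 \le 0.8 \; : g_{13}(\mathbf{x})$$
$$g_{14}(\mathbf{x}): \; 17 \le x_3 \le 28 \; : g_{15}(\mathbf{x})$$
$$g_{16}(\mathbf{x}): \; 7.3 \le x_4 \le 8.3 \; : g_{17}(\mathbf{x})$$
$$g_{18}(\mathbf{x}): \; 7.3 \le x_5 \le 8.3 \; : g_{19}(\mathbf{x})$$
$$g_{20}(\mathbf{x}): \; 2.9 \le x_6 \le 3.9 \; : g_{21}(\mathbf{x})$$
$$g_{22}(\mathbf{x}): \; 5.0 \le x_7 \le 5.5 \; : g_{23}(\mathbf{x})$$

$$g_{24}(\mathbf{x}) = (1.5\, x_6 + 1.9)\, x_4^{-1} \le 1$$
$$g_{25}(\mathbf{x}) = (1.1\, x_7 + 1.9)\, x_5^{-1} \le 1$$

Optimum solution:

$$\mathbf{X}^* = \{3.5 \;\; 0.7 \;\; 17.0 \;\; 7.3 \;\; 7.3 \;\; 3.35 \;\; 5.29\}^T, \qquad f^* = 2985.22$$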


7.22.5 Heat Exchanger Design [7.42] 

Objective function: Minimize $f(\mathbf{X}) = x_1 + x_2 + x_3$

Constraints:

$$g_1(\mathbf{X}) = 0.0025\,(x_4 + x_6) - 1 \le 0$$
$$g_2(\mathbf{X}) = 0.0025\,(-x_4 + x_5 + x_7) - 1 \le 0$$
$$g_3(\mathbf{X}) = 0.01\,(-x_5 + x_8) - 1 \le 0$$
$$g_4(\mathbf{X}) = 100\, x_1 - x_1 x_6 + 833.33252\, x_4 - 83{,}333.333 \le 0$$
$$g_5(\mathbf{X}) = x_2 x_4 - x_2 x_7 - 1250\, x_4 + 1250\, x_5 \le 0$$
$$g_6(\mathbf{X}) = x_3 x_5 - x_3 x_8 - 2500\, x_5 + 1{,}250{,}000 \le 0$$
$$g_7: \; 100 \le x_1 \le 10{,}000 \; : g_8$$
$$g_9: \; 1000 \le x_2 \le 10{,}000 \; : g_{10}$$
$$g_{11}: \; 1000 \le x_3 \le 10{,}000 \; : g_{12}$$
$$g_{13} \text{ to } g_{22}: \; 10 \le x_i \le 1000, \qquad i = 4, 5, \ldots, 8$$

Optimum solution: $\mathbf{X}^* = \{567 \;\; 1357 \;\; 5125 \;\; 181 \;\; 295 \;\; 219 \;\; 286 \;\; 395\}^T$, $f^* = 7049$
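A quick evaluation at the reported optimum shows which constraints are active. The sketch below assumes the six behavior constraints in the form $g_1 = 0.0025(x_4 + x_6) - 1$, $g_2 = 0.0025(-x_4 + x_5 + x_7) - 1$, $g_3 = 0.01(-x_5 + x_8) - 1$, $g_4 = 100x_1 - x_1x_6 + 833.33252x_4 - 83{,}333.333$, $g_5 = x_2x_4 - x_2x_7 - 1250x_4 + 1250x_5$, and $g_6 = x_3x_5 - x_3x_8 - 2500x_5 + 1{,}250{,}000$; the function names are illustrative.

```python
def objective(x):
    """f(X) = x1 + x2 + x3."""
    return x[0] + x[1] + x[2]

def behavior_constraints(x):
    """Values of g1 ... g6; each should be <= 0 at a feasible design."""
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    return [
        0.0025 * (x4 + x6) - 1.0,
        0.0025 * (-x4 + x5 + x7) - 1.0,
        0.01 * (-x5 + x8) - 1.0,
        100.0 * x1 - x1 * x6 + 833.33252 * x4 - 83333.333,
        x2 * x4 - x2 * x7 - 1250.0 * x4 + 1250.0 * x5,
        x3 * x5 - x3 * x8 - 2500.0 * x5 + 1250000.0,
    ]

x_star = [567.0, 1357.0, 5125.0, 181.0, 295.0, 219.0, 286.0, 395.0]
f_star = objective(x_star)
g_star = behavior_constraints(x_star)
print(f_star, g_star)
```

At the integer-rounded design, $g_1$, $g_2$, $g_3$, and $g_6$ evaluate to zero, while $g_4$ and $g_5$ come out slightly positive (roughly 27 and 15). Relative to the individual terms, which are of order $10^5$, this is about one part in $10^4$ and is probably an effect of rounding the design variables, so all six constraints appear to be essentially active.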


7.23 MATLAB SOLUTION OF CONSTRAINED OPTIMIZATION 
PROBLEMS 

The solution of multivariable minimization problems, with inequality and equality con- 
straints, using the MATLAB function fmincon is illustrated in this section. 

Example 7.13 Find the solution of Example 7.8 starting from the initial point
$X_1 = \{0.1 \;\; 0.1 \;\; 3.0\}^T$.

SOLUTION

Step 1: Write an M-file objfun.m for the objective function.

    function f = objfun(x)
    f = x(1)^3 - 6*x(1)^2 + 11*x(1) + x(3);

Step 2: Write an M-file constraints.m for the constraints.

    function [c, ceq] = constraints(x)
    % Nonlinear inequality constraints (each row must be <= 0)
    c = [x(1)^2 + x(2)^2 - x(3)^2; 4 - x(1)^2 - x(2)^2 - x(3)^2; x(3) - 5; ...
         -x(1); -x(2); -x(3)];
    % Nonlinear equality constraints
    ceq = [];

Step 3: Invoke constrained optimization program (write this in new MATLAB file).

    clc
    clear all
    warning off
    x0 = [.1, .1, 3.0]; % Starting guess
    fprintf('The values of function value and constraints at starting point\n');
    f = objfun(x0)
    [c, ceq] = constraints(x0)
    options = optimset('LargeScale', 'off');
    [x, fval] = fmincon(@objfun, x0, [], [], [], [], [], [], ...
        @constraints, options)
    fprintf('The values of constraints at optimum solution\n');
    [c, ceq] = constraints(x) % Check the constraint values at x

This produces the solution or output as follows:

    The values of function value and constraints at starting point
    f =
        4.0410
    c =
       -8.9800
       -5.0200
       -2.0000
       -0.1000
       -0.1000
       -3.0000
    ceq =
         []

    Optimization terminated: first-order optimality measure less
    than options.TolFun and maximum constraint violation is less
    than options.TolCon.

    Active inequalities (to within options.TolCon = 1e-006):
      lower      upper     ineqlin   ineqnonlin
                                         1
                                         2
                                         4
    x =
             0    1.4142    1.4142
    fval =
        1.4142

    The values of constraints at optimum solution
    c =
       -0.0000
       -0.0000
       -3.5858
             0
       -1.4142
       -1.4142
    ceq =
         []
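The printed values can be cross-checked without MATLAB. The sketch below is a plain Python transcription of the two M-files (it does not call any optimizer); it evaluates the objective and constraints at the starting point and at the reported optimum, taking the optimum as $\{0, \sqrt{2}, \sqrt{2}\}$, which is an assumption consistent with the printed 1.4142 entries.

```python
import math


def objfun(x):
    # Objective of Example 7.13: x1^3 - 6 x1^2 + 11 x1 + x3
    return x[0]**3 - 6 * x[0]**2 + 11 * x[0] + x[2]


def constraints(x):
    # Inequality constraints written in the form c(x) <= 0
    return [
        x[0]**2 + x[1]**2 - x[2]**2,
        4 - x[0]**2 - x[1]**2 - x[2]**2,
        x[2] - 5,
        -x[0],
        -x[1],
        -x[2],
    ]


x0 = [0.1, 0.1, 3.0]                        # starting point
x_opt = [0.0, math.sqrt(2), math.sqrt(2)]   # reported optimum (assumed exact form)

for label, x in (("start", x0), ("optimum", x_opt)):
    print(label, "f =", round(objfun(x), 4),
          "c =", [round(ci, 4) for ci in constraints(x)])
```

At the optimum the first, second, and fourth constraints evaluate to (numerically) zero, which matches the active set listed in the solver output.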
REVIEW QUESTIONS

7.1 Answer true or false: 

(a) The complex method is similar to the simplex method. 

(b) The optimum solution of a constrained problem can be the same as the unconstrained 
optimum. 

(c) The constraints can introduce local minima in the feasible space. 

(d) The complex method can handle both equality and inequality constraints. 

(e) The complex method can be used to solve both convex and nonconvex problems. 

(f) The number of inequality constraints cannot exceed the number of design variables. 

(g) The complex method requires a feasible starting point.

(h) The solutions of all LP problems in the SLP method lie in the infeasible domain of
the original problem. 

(i) The SLP method is applicable to both convex and nonconvex problems. 

(j ) The usable feasible directions can be generated using random numbers. 

(k) The usable feasible direction makes an obtuse angle with the gradients of all the 
constraints. 

(l) If the starting point is feasible, all subsequent unconstrained minima will be feasible 
in the exterior penalty function method. 

(m) The interior penalty function method can be used to find a feasible starting point. 

(n) The penalty parameter approaches zero as k approaches infinity in the exterior 
penalty function method. 

(o) The design vector found through extrapolation can be used as a starting point for the 
next unconstrained minimization in the interior penalty function method. 

7.2 Why is the SLP method called the cutting plane method? 

7.3 How is the direction-finding problem solved in Zoutendijk’s method? 

7.4 What is SUMT? 

7.5 How is a parametric constraint handled in the interior penalty function method? 

7.6 How can you identify an active constraint during numerical optimization? 

7.7 Formulate the equivalent unconstrained objective function that can be used in random 
search methods. 

7.8 How is the perturbation method used as a convergence check? 

7.9 How can you compute Lagrange multipliers during numerical optimization? 

7.10 What is the use of extrapolating the objective function in the penalty function approach? 

7.11 Why is handling of equality constraints difficult in the penalty function methods? 

7.12 What is the geometric interpretation of the reduced gradient? 

7.13 Is the generalized reduced gradient zero at the optimum solution? 

7.14 What is the relation between the sequential quadratic programming method and the 
Lagrangian function? 

7.15 Approximate the nonlinear function $f(X)$ as a linear function at $X_0$.

7.16 What is the limitation of the linear extended penalty function? 

7.17 What is the difference between the interior and extended interior penalty function 
methods? 

7.18 What is the basic principle used in the augmented Lagrangian method? 

7.19 When can you use the steepest descent direction as a usable feasible direction in 
Zoutendijk’s method? 

7.20 Construct the augmented Lagrangian function for a constrained optimization problem.

7.21 Construct the $\phi_k$ function to be used for a mixed equality-inequality constrained problem
in the interior penalty function approach.

7.22 What is a parametric constraint? 

7.23 Match the following methods:

(a) Zoutendijk method                   Heuristic method
(b) Cutting plane method                Barrier method
(c) Complex method                      Feasible directions method
(d) Projected Lagrangian method         Sequential linear programming method
(e) Penalty function method             Gradient projection method
(f) Rosen’s method                      Sequential unconstrained minimization method
(g) Interior penalty function method    Sequential quadratic programming method

7.24 Answer true or false: 

(a) Rosen’s gradient projection method is a method of feasible directions.

(b) The starting vector can be infeasible in Rosen’s gradient projection method. 

(c) The transformation methods seek to convert a constrained problem into an uncon- 
strained one. 

(d) The $\phi_k$ function is defined over the entire design space in the interior penalty function
method.

(e) The sequence of unconstrained minima generated by the interior penalty function 
method lies in the feasible space. 

(f) The sequence of unconstrained minima generated by the exterior penalty function 
method lies in the feasible space. 

(g) The random search methods are applicable to convex and nonconvex optimization 
problems. 

(h) The GRG method is related to the method of elimination of variables. 

(i) The sequential quadratic programming method can handle only equality constraints. 

(j) The augmented Lagrangian method is based on the concepts of penalty function and 
Lagrange multiplier methods. 

(k) The starting vector can be infeasible in the augmented Lagrangian method.

PROBLEMS 

7.1 Find the solution of the problem:

Minimize $f(X) = x_1^2 + 2x_2^2 - 2x_1x_2 - 14x_1 - 14x_2 + 10$

subject to

$4x_1^2 + x_2^2 - 25 \le 0$

using a graphical procedure.

7.2 Generate four feasible design vectors to the welded beam design problem (Section 7.22.3)
using random numbers.

7.3 Generate four feasible design vectors to the three-bar truss design problem (Section 7.22.1)
using random numbers.

7.4 Consider the tubular column described in Example 1.1. Starting from the design vector
(d = 8.0 cm, t = 0.4 cm), complete two steps of reflection, expansion, and/or contraction
of the complex method.

7.5 Consider the problem:

Minimize $f(X) = x_1 - x_2$

subject to

$3x_1^2 - 2x_1x_2 + x_2^2 - 1 \le 0$


(a) Generate the approximating LP problem at the vector, X i = { ,} . 

(b) Solve the approximating LP problem using graphical method and find whether the 
resulting solution is feasible to the original problem. 

7.6 Approximate the following optimization problem as (a) a quadratic programming problem, 
and (b) a linear programming problem at X = { _ * } . 

Minimize / (X) = 2*i + 15xf — 8*1*2 + 15 


subject to 


*i + *1*2 + 1=0 
4*i — xf < 4 


7.7 The problem of minimum volume design subject to stress constraints of the three-bar
truss shown in Fig. 7.21 can be stated as follows:

Minimize $f(X) = 282.8 x_1 + 100.0 x_2$

subject to

$$\sigma_1 - \sigma_0 = \frac{20 (x_2 + \sqrt{2} x_1)}{2 x_1 x_2 + \sqrt{2} x_1^2} - 20 \le 0$$
$$-\sigma_3 - \sigma_0 = \frac{20 x_2}{2 x_1 x_2 + \sqrt{2} x_1^2} - 20 \le 0$$
$$0 \le x_i \le 0.3, \quad i = 1, 2$$

where $\sigma_i$ is the stress induced in member $i$, $\sigma_0 = 20$ the permissible stress, $x_1$ the area
of cross section of members 1 and 3, and $x_2$ the area of cross section of member 2.
Approximate the problem as a LP problem at ($x_1 = 1$, $x_2 = 1$).

7.8 Minimize $f(X) = x_1^2 + x_2^2 - 6x_1 - 8x_2 + 10$

subject to

$4x_1^2 + x_2^2 \le 16$
$3x_1 + 5x_2 \le 15$
$x_i \ge 0, \quad i = 1, 2$

with the starting point X] = {j}. Using the cutting plane method, complete one step of 
the process. 
7.9 Minimize $f(X) = 9x_1^2 + 6x_2^2 + x_3^2 - 18x_1 - 12x_2 - 6x_3 - 8$
subject to

$x_1 + 2x_2 + x_3 \le 4$
$x_i \ge 0, \quad i = 1, 2, 3$

Using the starting point $X_1 = \{0, 0, 0\}^T$, complete one step of the sequential linear
programming method.

7.10 Complete one cycle of the sequential linear programming method for the truss of 
Section 7.22.1 using the starting point, Xi = {[}. 

7.11 A flywheel is a large mass that can store energy during coasting of an engine and feed
it back to the drive when required. A solid disk-type flywheel is to be designed for an
engine to store maximum possible energy with the following specifications: maximum
permissible weight = 150 lb, maximum permissible diameter ($d$) = 25 in., maximum
rotational speed = 3000 rpm, maximum allowable stress ($\sigma_{\max}$) = 20,000 psi, unit weight
($\gamma$) = 0.283 lb/in$^3$, and Poisson’s ratio ($\nu$) = 0.3. The energy stored in the flywheel is
given by $\tfrac{1}{2} I \omega^2$, where $I$ is the mass moment of inertia and $\omega$ is the angular velocity, and
the maximum tangential and radial stresses developed in the flywheel are given by


y(3 + v)co 2 d 2 



where g is the acceleration due to gravity and d the diameter of the flywheel. The distortion 
energy theory of failure is to be used, which leads to the stress constraint 

$$\sigma_t^2 + \sigma_r^2 - \sigma_t \sigma_r \le \sigma_{\max}^2$$


Considering the diameter (d) and the width (w) as design variables, formulate the opti- 
mization problem. Starting from (d = 15 in., w = 2 in.), complete one iteration of the 
SLP method. 


7.12 Derive the necessary conditions of optimality and find the solution for the following
problem:

Minimize $f(X) = 5x_1x_2$

subject to

$25 - x_1^2 - x_2^2 \ge 0$


7.13 Consider the following problem:

Minimize $f = (x_1 - 5)^2 + (x_2 - 5)^2$

subject to

$x_1 + 2x_2 \le 15$
$1 \le x_i \le 10, \quad i = 1, 2$


Derive the conditions to be satisfied at the point X = {’} by the search direction S = {'' [ 
if it is to be a usable feasible direction. 
7.14 Consider the problem:

Minimize $f = (x_1 - 1)^2 + (x_2 - 5)^2$

subject to

$g_1 = -x_1^2 + x_2 - 4 \le 0$
$g_2 = (x_1 - 2)^2 + x_2 - 3 \le 0$

Formulate the direction-finding problem at X,- = { — ^ } as a linear programming problem 
(in Zoutendijk method). 

7.15 Minimize $f(X) = (x_1 - 1)^2 + (x_2 - 5)^2$
subject to

$-x_1^2 + x_2 \le 4$
$-(x_1 - 2)^2 + x_2 \le 3$

starting from the point X i = { [ } and using Zoutendijk’ s method. Complete two 
one-dimensional minimization steps. 

7.16 Minimize $f(X) = (x_1 - 1)^2 + (x_2 - 2)^2 - 4$
subject to

$x_1 + 2x_2 \le 5$
$4x_1 + 3x_2 \le 10$
$6x_1 + x_2 \le 7$
$x_i \ge 0, \quad i = 1, 2$


by using Zoutendijk’ s method from the starting point X i = { j } - Perform two 
one-dimensional minimization steps of the process. 

7.17 Complete one iteration of Rosen’s gradient projection method for the following problem:

Minimize $f = (x_1 - 1)^2 + (x_2 - 2)^2 - 4$

subject to

$x_1 + 2x_2 \le 5$
$4x_1 + 3x_2 \le 10$
$6x_1 + x_2 \le 7$
$x_i \ge 0, \quad i = 1, 2$


Use the starting point, X| = {[}. 

7.18 Complete one iteration of the GRG method for the problem:

Minimize $f = x_1^2 + x_2^2$

subject to

$x_1 x_2 - 9 = 0$

starting from $X_1 = \{2.0, 4.5\}^T$.

7.19 Approximate the following problem as a quadratic programming problem at ($x_1 = 1$,
$x_2 = 1$):

Minimize $f = x_1^2 + x_2^2 - 6x_1 - 8x_2 + 15$

subject to

$4x_1^2 + x_2^2 \le 16$
$3x_1^2 + 5x_2^2 \le 15$
$x_i \ge 0, \quad i = 1, 2$


7.20 Consider the truss structure shown in Fig. 7.25. The minimum weight design of the truss
subject to a constraint on the deflection of node S along with lower bounds on the cross
sectional areas of members can be stated as follows:

Minimize $f = 0.1847 x_1 + 0.1306 x_2$

subject to

$$\frac{26.1546}{x_1} + \frac{30.1546}{x_2} \le 1.0$$
$$x_i \ge 25 \text{ mm}^2, \quad i = 1, 2$$

Complete one iteration of sequential quadratic programming method for this problem. 

7.21 Find the dimensions of a rectangular prism type parcel that has the largest volume when 
each of its sides is limited to 42 in. and its depth plus girth is restricted to a maximum 
value of 72 in. Solve the problem as an unconstrained minimization problem using suitable 
transformations. 

Figure 7.25 Four-bar truss.

7.22 Transform the following constrained problem into an equivalent unconstrained problem:

$$\text{Maximize } f(x_1, x_2) = [9 - (x_1 - 3)^2] \frac{x_2^3}{27\sqrt{3}}$$

subject to

$$0 \le x_1$$
$$0 \le x_2 \le \frac{x_1}{\sqrt{3}}$$
$$0 \le x_1 + \sqrt{3} x_2 \le 6$$


7.23 Construct the $\phi_k$ function, according to (a) interior and (b) exterior penalty function
methods and plot its contours for the following problem:

Maximize $f = 2x$

subject to

$2 \le x \le 10$

7.24 Construct the $\phi_k$ function according to the exterior penalty function approach and complete
the minimization of $\phi_k$ for the following problem.

Minimize $f(x) = (x - 1)^2$

subject to

$g_1(x) = 2 - x \le 0, \quad g_2(x) = x - 4 \le 0$

7.25 Plot the contours of the $\phi_k$ function using the quadratic extended interior penalty function
method for the following problem:

Minimize $f(x) = (x - 1)^2$

subject to

$g_1(x) = 2 - x \le 0, \quad g_2(x) = x - 4 \le 0$

7.26 Consider the problem:

Minimize $f(x) = x^2 - 10x - 1$

subject to

$1 \le x \le 10$

Plot the contours of the $\phi_k$ function using the linear extended interior penalty function
method.

7.27 Consider the problem:

Minimize $f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 2)^2$

subject to

$2x_1 - x_2 = 0$ and $x_1 \le 5$

Construct the $\phi_k$ function according to the interior penalty function approach and complete
the minimization of $\phi_k$.

7.28 Solve the following problem using an interior penalty function approach coupled with the
calculus method of unconstrained minimization:

Minimize $f = x^2 - 2x - 1$

subject to

$1 - x \ge 0$

Note: Sequential minimization is not necessary.

7.29 Consider the problem:

Minimize $f = x_1^2 + x_2^2 - 6x_1 - 8x_2 + 15$

subject to

$4x_1^2 + x_2^2 \ge 16, \quad 3x_1 + 5x_2 \le 15$

Normalize the constraints and find a suitable value of $r_1$ for use in the interior penalty
function method at the starting point $(x_1, x_2) = (0, 0)$.

7.30 Determine whether the following optimization problem is convex, concave, or neither
type:

Minimize $f = -4x_1 + x_1^2 - 2x_1x_2 + 2x_2^2$

subject to

$2x_1 + x_2 \le 6, \quad x_1 - 4x_2 \le 0, \quad x_i \ge 0, \quad i = 1, 2$


7.31 Find the solution of the following problem using an exterior penalty function method with
classical method of unconstrained minimization:

Minimize $f(x_1, x_2) = (2x_1 - x_2)^2 + (x_2 + 1)^2$

subject to

$x_1 + x_2 = 10$

Consider the limiting case as $r_k \to \infty$ analytically.

7.32 Minimize $f = 3x_1^2 + 4x_2^2$ subject to $x_1 + 2x_2 = 8$ using an exterior penalty function
method with the calculus method of unconstrained minimization.

7.33 A beam of uniform rectangular cross section is to be cut from a log having a circular 
cross section of diameter 2a. The beam is to be used as a cantilever beam to carry a 
concentrated load at the free end. Find the cross-sectional dimensions of the beam which 
will have the maximum bending stress carrying capacity using an exterior penalty function 
approach with analytical unconstrained minimization. 

7.34 Consider the problem:

Minimize $f = \tfrac{1}{3}(x_1 + 1)^3 + x_2$

subject to

$1 - x_1 \le 0, \quad x_2 \ge 0$

The results obtained during the sequential minimization of this problem according to the
exterior penalty function approach are given below:

Value of k   Value of $r_k$   Starting point for minimization   Unconstrained minimum of      $f(X_k^*) = f_k^*$
                              of $\phi(X, r_k)$                 $\phi(X, r_k) = X_k^*$
1            1                (-0.4597, -5.0)                   (0.2361, -0.5)                0.1295
2            10               (0.2361, -0.5)                    (0.8322, -0.05)               2.0001

Estimate the optimum solution, $X^*$ and $f^*$, using a suitable extrapolation technique.

7.35 The results obtained in an exterior penalty function method of solution for the optimization
problem stated in Problem 7.15 are given below:

$r_1 = 0.01, \quad X_1^* = \{-0.80975, \; -50.0\}^T, \quad \phi_1^* = -24.9650, \quad f_1^* = -49.9977$
$r_2 = 1.0, \quad X_2^* = \{0.23607, \; -0.5\}^T, \quad \phi_2^* = 0.9631, \quad f_2^* = 0.1295$

Estimate the optimum design vector and optimum objective function using an extrapolation
method.


7.36 The following results have been obtained during an exterior penalty function approach:

$r_1 = 10^{-10}, \quad X_1^* = \{0.66, \; 28.6\}^T$
$r_2 = 10^{-9}, \quad X_2^* = \{1.57, \; 18.7\}^T$

Find the optimum solution, $X^*$, using an extrapolation technique.


7.37 The results obtained in a sequential unconstrained minimization technique (using an exte- 
rior penalty function approach) from the starting point Xi = | are 


$r_1 = 10^{-10}, \quad X_1^* = \{0.66, \; 28.6\}^T; \quad r_2 = 10^{-9}, \quad X_2^* = \{1.57, \; 18.7\}^T;$
$r_3 = 10^{-8}, \quad X_3^* = \{1.86, \; 18.8\}^T$


Estimate the optimum solution using a suitable extrapolation technique. 

7.38 The two-bar truss shown in Fig. 7.26 is acted on by a varying load whose magnitude
is given by $P(\theta) = P_0 \cos 2\theta$; $0^\circ \le \theta \le 360^\circ$. The bars have a tubular section with
mean diameter $d$ and wall thickness $t$. Using $P_0$ = 50,000 lb, $\sigma_{\text{yield}}$ = 30,000 psi, and
$E = 30 \times 10^6$ psi, formulate the problem as a parametric optimization problem for
minimum volume design subject to buckling and yielding constraints. Assume the bars to be
pin connected for the purpose of buckling analysis. Indicate the procedure that can be
used for a graphical solution of the problem.

7.39 Minimize $f(X) = (x_1 - 1)^2 + (x_2 - 2)^2$
subject to

$x_1 + 2x_2 - 2 = 0$

using the augmented Lagrange multiplier method with a fixed value of $r_p = 1$. Use a
maximum of three iterations.

Figure 7.26 Two-bar truss subjected to a parametric load.


7.40 Solve the following optimization problem using the augmented Lagrange multiplier
method keeping $r_p = 1$ throughout the iterative process and $\lambda^{(1)} = 0$:

Minimize $f = (x_1 - 1)^2 + (x_2 - 2)^2$

subject to

$-x_1 + 2x_2 = 2$


7.41 Consider the problem:

Minimize $f = (x_1 - 1)^2 + (x_2 - 5)^2$

subject to

$x_1 + x_2 - 5 = 0$

(a) Write the expression for the augmented Lagrange function with $r_p = 1$.

(b) Start with $\lambda_1^{(1)} = 0$ and perform two iterations.

(c) Find $\lambda_1^{(3)}$.

7.42 Consider the optimization problem:

Minimize $f = x_1^3 - 6x_1^2 + 11x_1 + x_3$

subject to

$x_1^2 + x_2^2 - x_3^2 \le 0, \quad 4 - x_1^2 - x_2^2 - x_3^2 \le 0, \quad x_3 \le 5,$
$x_i \ge 0, \quad i = 1, 2, 3$

Determine whether the solution

$X = \{0, \; \sqrt{2}, \; \sqrt{2}\}^T$

is optimum by finding the values of the Lagrange multipliers.

7.43 Determine whether the solution

$X = \{0, \; \sqrt{2}, \; \sqrt{2}\}^T$

is optimum for the problem considered in Example 7.8 using a perturbation method with
$\Delta x_i = 0.001$, $i = 1, 2, 3$.

7.44 The following results are obtained during the minimization of

$f(X) = 9 - 8x_1 - 6x_2 - 4x_3 + 2x_1^2 + 2x_2^2 + x_3^2 + 2x_1x_2 + 2x_1x_3$

subject to

$x_1 + x_2 + 2x_3 \le 3$
$x_i \ge 0, \quad i = 1, 2, 3$

using the interior penalty function method:

Value of $r_i$   Starting point for minimization   Unconstrained minimum of         $f(X_i^*) = f_i^*$
                 of $\phi(X, r_i)$                 $\phi(X, r_i) = X_i^*$
1                                                  {0.8884, 0.7188, 0.7260}         0.7072
0.01             {0.8884, 0.7188, 0.7260}          {1.3313, 0.7539, 0.3710}         0.1564
0.0001           {1.3313, 0.7539, 0.3710}          {1.3478, 0.7720, 0.4293}         0.1158

Use an extrapolation technique to predict the optimum solution of the problem using the
following relations:

(a) $X(r) = A_0 + r A_1$; $f(r) = a_0 + r a_1$

(b) $X(r) = A_0 + r^{1/2} A_1$; $f(r) = a_0 + r^{1/2} a_1$

Compare your results with the exact solution

$X^* = \{12/9, \; 7/9, \; 4/9\}^T, \quad f^* = 1/9$

7.45 Find the extrapolated solution of Problem 7.44 by using quadratic relations for $X(r)$ and
$f(r)$.

7.46 Give a proof for the convergence of exterior penalty function method. 

7.47 Write a computer program to implement the interior penalty function method with 
the DFP method of unconstrained minimization and the cubic interpolation method of 
one-dimensional search. 

7.48 Write a computer program to implement the exterior penalty function method with 
the BFGS method of unconstrained minimization and the direct root method of 
one-dimensional search. 

7.49 Write a computer program to implement the augmented Lagrange multiplier method with 
a suitable method of unconstrained minimization. 

7.50 Write a computer program to implement the sequential linear programming method. 

7.51 Find the solution of the welded beam design problem formulated in Section 7.22.3 using
the MATLAB function fmincon with the starting point $X_1 = \{0.4, 6.0, 9.0, 0.5\}^T$.

7.52 Find the solution of the following problem (known as Rosen-Suzuki problem) using the
MATLAB function fmincon with the starting point $X_1 = \{0, 0, 0, 0\}^T$:

Minimize

$f(X) = x_1^2 + x_2^2 + 2x_3^2 - x_4^2 - 5x_1 - 5x_2 - 21x_3 + 7x_4 + 100$

subject to

$x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_1 - x_2 + x_3 - x_4 - 100 \le 0$
$x_1^2 + 2x_2^2 + x_3^2 + 2x_4^2 - x_1 - x_4 - 10 \le 0$
$2x_1^2 + x_2^2 + x_3^2 + 2x_1 - x_2 - x_4 - 5 \le 0$
$-100 \le x_i \le 100, \quad i = 1, 2, 3, 4$

7.53 Find the solution of the following problem using the MATLAB function fmincon with
the starting point $X_1 = \{0.5, 1.0\}^T$:

Minimize

$f(X) = x_1^2 + x_2^2 - 4x_1 - 6x_2$

subject to

$x_1 + x_2 \le 2$
$2x_1 + 3x_2 \le 12$
$x_i \ge 0, \quad i = 1, 2$

7.54 Find the solution of the following problem using the MATLAB function fmincon with
the starting point $X_1 = \{0.5, 1.0, 1.0\}^T$:

Minimize $f(X) = x_1^2 + 3x_2^2 + x_3$
subject to

$x_1^2 + x_2^2 + x_3^2 = 16$


7.55 Find the solution of the following problem using the MATLAB function fmincon with
the starting point $X_1 = \{1.0, 1.0\}^T$:

Minimize /' (X ) = xf + x\ 
subject to 

4 — x\ — x\ < 0 
3x‘2 — x\ < 0 
— 3.T2 — Xl < 0 
Geometric Programming


8.1 INTRODUCTION 


Geometric programming is a relatively new method of solving a class of nonlinear
programming problems. It was developed by Duffin, Peterson, and Zener [8.1]. It is
used to minimize functions that are in the form of posynomials subject to constraints of
the same type. It differs from other optimization techniques in the emphasis it places on
the relative magnitudes of the terms of the objective function rather than the variables.
Instead of finding optimal values of the design variables first, geometric programming
first finds the optimal value of the objective function. This feature is especially
advantageous in situations where the optimal value of the objective function may be all that
is of interest. In such cases, calculation of the optimum design vectors can be omitted.
Another advantage of geometric programming is that it often reduces a complicated
optimization problem to one involving a set of simultaneous linear algebraic equations.
The major disadvantage of the method is that it requires the objective function and
the constraints to be in the form of posynomials. We will first see the general form of a
posynomial.


8.2 POSYNOMIAL

In an engineering design situation, frequently the objective function (e.g., the total cost)
$f(X)$ is given by the sum of several component costs $U_i(X)$ as

$$f(X) = U_1 + U_2 + \cdots + U_N \tag{8.1}$$

In many cases, the component costs $U_i$ can be expressed as power functions of the
type

$$U_i = c_i x_1^{a_{1i}} x_2^{a_{2i}} \cdots x_n^{a_{ni}} \tag{8.2}$$

where the coefficients $c_i$ are positive constants, the exponents $a_{ij}$ are real constants
(positive, zero, or negative), and the design parameters $x_1, x_2, \ldots, x_n$ are taken to be
positive variables. Functions like $f$, because of the positive coefficients and variables
and real exponents, are called posynomials. For example,

$$f(x_1, x_2, x_3) = 6 + 3x_1 - 8x_2 + 7x_3 + 2x_1x_2 - 3x_1x_3 + \tfrac{4}{3} x_2x_3 + \tfrac{8}{7} x_1^2 - 9x_2^2 + x_3^2$$

is a second-degree polynomial in the variables $x_1$, $x_2$, and $x_3$ (coefficients of the various
terms are real) while

$$g(x_1, x_2, x_3) = x_1x_2x_3 + x_1^2x_2 + 4x_3 + \frac{2}{x_1x_2} + 5x_3^{-1/2}$$

is a posynomial. If the natural formulation of the optimization problem does not lead to
posynomial functions, geometric programming techniques can still be applied to solve
the problem by replacing the actual functions by a set of empirically fitted posynomials
over a wide range of the parameters $x_i$.
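The structure of a posynomial (positive coefficients, real exponents, positive variables) maps directly onto a coefficient list and an exponent matrix. The sketch below is a minimal illustration; the function name `posynomial` and the convention that `a[i][j]` is the exponent of variable $i$ in term $j$ are choices made here for the example. The data are the four cost terms $80x_1x_2$, $40x_2x_3$, $20x_1x_3$, and $80/(x_1x_2x_3)$ of the grain-box example treated in this chapter.

```python
def posynomial(c, a, x):
    """Evaluate sum_j c[j] * prod_i x[i]**a[i][j] for positive c and x."""
    if any(cj <= 0 for cj in c):
        raise ValueError("posynomial coefficients must be positive")
    if any(xi <= 0 for xi in x):
        raise ValueError("posynomial variables must be positive")
    total = 0.0
    for j, cj in enumerate(c):
        term = cj
        for i, xi in enumerate(x):
            term *= xi ** a[i][j]   # exponents may be any real number
        total += term
    return total


# Cost terms 80*x1*x2, 40*x2*x3, 20*x1*x3 and 80/(x1*x2*x3)
c = [80.0, 40.0, 20.0, 80.0]
a = [[1, 0, 1, -1],   # exponents of x1 in terms 1..4
     [1, 1, 0, -1],   # exponents of x2
     [0, 1, 1, -1]]   # exponents of x3

print(posynomial(c, a, [1.0, 1.0, 1.0]))
print(posynomial(c, a, [2.0, 1.0, 1.0]))
```

Storing the exponents column-wise by term keeps the same matrix usable later for the orthogonality conditions.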


8.3 UNCONSTRAINED MINIMIZATION PROBLEM 

Consider the unconstrained minimization problem:

Find $X = \{x_1, x_2, \ldots, x_n\}^T$

that minimizes the objective function

$$f(X) = \sum_{j=1}^{N} U_j(X) = \sum_{j=1}^{N} \left( c_j \prod_{i=1}^{n} x_i^{a_{ij}} \right) = \sum_{j=1}^{N} \left( c_j x_1^{a_{1j}} x_2^{a_{2j}} \cdots x_n^{a_{nj}} \right) \tag{8.3}$$

where $c_j > 0$, $x_i > 0$, and the $a_{ij}$ are real constants.

The solution of this problem can be obtained by various procedures. In the fol- 
lowing sections, two approaches — one based on the differential calculus and the other 
based on the concept of geometric inequality — are presented for the solution of the 
problem stated in Eq. (8.3). 


8.4 SOLUTION OF AN UNCONSTRAINED GEOMETRIC
PROGRAMMING PROBLEM USING
DIFFERENTIAL CALCULUS

According to the differential calculus methods presented in Chapter 2, the necessary
conditions for the minimum of $f$ are given by

$$\frac{\partial f}{\partial x_k} = \sum_{j=1}^{N} \frac{\partial U_j}{\partial x_k} = \sum_{j=1}^{N} \left( c_j x_1^{a_{1j}} x_2^{a_{2j}} \cdots x_{k-1}^{a_{k-1,j}} \, a_{kj} x_k^{a_{kj}-1} \, x_{k+1}^{a_{k+1,j}} \cdots x_n^{a_{nj}} \right) = 0, \quad k = 1, 2, \ldots, n \tag{8.4}$$

By multiplying Eq. (8.4) by $x_k$, we can rewrite it as

$$x_k \frac{\partial f}{\partial x_k} = \sum_{j=1}^{N} a_{kj} \left( c_j x_1^{a_{1j}} x_2^{a_{2j}} \cdots x_{k-1}^{a_{k-1,j}} x_k^{a_{kj}} x_{k+1}^{a_{k+1,j}} \cdots x_n^{a_{nj}} \right) = \sum_{j=1}^{N} a_{kj} U_j(X) = 0, \quad k = 1, 2, \ldots, n \tag{8.5}$$


To find the minimizing vector

$$X^* = \{x_1^*, x_2^*, \ldots, x_n^*\}^T$$

we have to solve the $n$ equations given by Eqs. (8.4), simultaneously. To ensure that the
point $X^*$ corresponds to the minimum of $f$ (and not to a maximum or to some other stationary
point), the sufficiency condition must be satisfied. This condition states that the
Hessian matrix of $f$ evaluated at $X^*$:

$$J_{X^*} = \left[ \frac{\partial^2 f}{\partial x_k \, \partial x_l} \right]_{X^*}$$

must be positive definite. We will see this condition at a later stage. Since the vector
$X^*$ satisfies Eqs. (8.5), we have

$$\sum_{j=1}^{N} a_{kj} U_j(X^*) = 0, \quad k = 1, 2, \ldots, n \tag{8.6}$$

After dividing by the minimum value of the objective function $f^*$, Eq. (8.6) becomes

$$\sum_{j=1}^{N} \Delta_j^* a_{kj} = 0, \quad k = 1, 2, \ldots, n \tag{8.7}$$

where the quantities $\Delta_j^*$ are defined as

$$\Delta_j^* = \frac{U_j(X^*)}{f^*} = \frac{U_j^*}{f^*} \tag{8.8}$$

and denote the relative contribution of the $j$th term to the optimal objective function. From
Eq. (8.8), we obtain

$$\sum_{j=1}^{N} \Delta_j^* = \Delta_1^* + \Delta_2^* + \cdots + \Delta_N^* = \frac{1}{f^*} (U_1^* + U_2^* + \cdots + U_N^*) = 1 \tag{8.9}$$

Equations (8.7) are called the orthogonality conditions and Eq. (8.9) is called the
normality condition. To obtain the minimum value of the objective function $f^*$, the
following procedure can be adopted. Consider

$$f^* = (f^*)^1 = (f^*)^{\sum_{j=1}^{N} \Delta_j^*} = (f^*)^{\Delta_1^*} (f^*)^{\Delta_2^*} \cdots (f^*)^{\Delta_N^*} \tag{8.10}$$

Since

$$f^* = \frac{U_1^*}{\Delta_1^*} = \frac{U_2^*}{\Delta_2^*} = \cdots = \frac{U_N^*}{\Delta_N^*} \tag{8.11}$$

from Eq. (8.8), Eq. (8.10) can be rewritten as

$$f^* = \left( \frac{U_1^*}{\Delta_1^*} \right)^{\Delta_1^*} \left( \frac{U_2^*}{\Delta_2^*} \right)^{\Delta_2^*} \cdots \left( \frac{U_N^*}{\Delta_N^*} \right)^{\Delta_N^*} \tag{8.12}$$

By substituting the relation

$$U_j^* = c_j \prod_{i=1}^{n} (x_i^*)^{a_{ij}}, \quad j = 1, 2, \ldots, N$$

Eq. (8.12) becomes

$$f^* = \left\{ \left( \frac{c_1}{\Delta_1^*} \right)^{\Delta_1^*} \left[ \prod_{i=1}^{n} (x_i^*)^{a_{i1}} \right]^{\Delta_1^*} \right\} \left\{ \left( \frac{c_2}{\Delta_2^*} \right)^{\Delta_2^*} \left[ \prod_{i=1}^{n} (x_i^*)^{a_{i2}} \right]^{\Delta_2^*} \right\} \cdots \left\{ \left( \frac{c_N}{\Delta_N^*} \right)^{\Delta_N^*} \left[ \prod_{i=1}^{n} (x_i^*)^{a_{iN}} \right]^{\Delta_N^*} \right\}$$

$$= \left\{ \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j^*} \right)^{\Delta_j^*} \right\} \left\{ \prod_{j=1}^{N} \left[ \prod_{i=1}^{n} (x_i^*)^{a_{ij}} \right]^{\Delta_j^*} \right\} = \left\{ \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j^*} \right)^{\Delta_j^*} \right\} \left[ \prod_{i=1}^{n} (x_i^*)^{\sum_{j=1}^{N} a_{ij} \Delta_j^*} \right] = \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j^*} \right)^{\Delta_j^*} \tag{8.13}$$

since

$$\sum_{j=1}^{N} a_{ij} \Delta_j^* = 0 \quad \text{for any } i \text{ from Eq. (8.7)}$$

Thus the optimal objective function $f^*$ can be found from Eq. (8.13) once the $\Delta_j^*$ are
determined. To determine $\Delta_j^*$ ($j = 1, 2, \ldots, N$), Eqs. (8.7) and (8.9) can be used. It
can be seen that there are $n + 1$ equations in $N$ unknowns. If $N = n + 1$, there will
be as many linear simultaneous equations as there are unknowns and we can find a
unique solution.

Degree of Difficulty. The quantity $N - n - 1$ is termed the degree of difficulty in
geometric programming. In the case of a constrained geometric programming problem,
$N$ denotes the total number of terms in all the posynomials and $n$ represents the number
of design variables. If $N - n - 1 = 0$, the problem is said to have a zero degree of
difficulty. In this case, the unknowns $\Delta_j^*$ ($j = 1, 2, \ldots, N$) can be determined uniquely
from the orthogonality and normality conditions. If $N$ is greater than $n + 1$, we have
more unknowns ($\Delta_j^*$) than equations, and the method of solution for
this case will be discussed in subsequent sections. It is to be noted that geometric
programming is not applicable to problems with negative degree of difficulty.

Sufficiency Condition. We can see that the $\Delta_j^*$ are found by solving Eqs. (8.7) and (8.9),
which in turn are obtained by using the necessary conditions only. We can show that
these conditions are also sufficient.
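For a zero-degree-of-difficulty problem, the orthogonality and normality conditions form a square linear system, so the weights $\Delta_j^*$ follow from ordinary elimination. The sketch below illustrates this with exact rational arithmetic; the helper names `solve_linear` and `dual_weights` are assumptions introduced for the example, and the exponent matrix is that of a four-term, three-variable posynomial with terms $x_1x_2$, $x_2x_3$, $x_1x_3$, and $1/(x_1x_2x_3)$.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Gauss-Jordan elimination in exact arithmetic for a square system."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("conditions are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


def dual_weights(a):
    """Solve the n orthogonality conditions plus the normality condition."""
    n, N = len(a), len(a[0])
    if N - n - 1 != 0:
        raise ValueError("this sketch handles zero degree of difficulty only")
    A = [list(row) for row in a] + [[1] * N]   # last row: sum of weights = 1
    b = [0] * n + [1]
    return solve_linear(A, b)


a = [[1, 0, 1, -1],   # exponents of x1 in terms 1..4
     [1, 1, 0, -1],   # exponents of x2
     [0, 1, 1, -1]]   # exponents of x3
delta = dual_weights(a)
print([str(d) for d in delta])
```

Note that the weights depend only on the exponents, not on the coefficients $c_j$; the coefficients enter only when $f^*$ is computed from Eq. (8.13).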


Finding the Optimal Values of Design Variables. Since $f^*$ and $\Delta_j^*$ ($j = 1, 2, \ldots, N$)
are known, we can determine the optimal values of the design variables
from the relations

$$U_j^* = \Delta_j^* f^* = c_j \prod_{i=1}^{n} (x_i^*)^{a_{ij}}, \quad j = 1, 2, \ldots, N \tag{8.14}$$

The simultaneous solution of these equations will yield the desired quantities $x_i^*$
($i = 1, 2, \ldots, n$). It can be seen that Eqs. (8.14) are nonlinear in terms of the variables
$x_1^*, x_2^*, \ldots, x_n^*$, and hence their simultaneous solution is not easy if we want to solve
them directly. To simplify the simultaneous solution of Eqs. (8.14), we rewrite them as

$$\frac{\Delta_j^* f^*}{c_j} = (x_1^*)^{a_{1j}} (x_2^*)^{a_{2j}} \cdots (x_n^*)^{a_{nj}}, \quad j = 1, 2, \ldots, N \tag{8.15}$$

By taking logarithms on both the sides of Eqs. (8.15), we obtain

$$\ln \frac{\Delta_j^* f^*}{c_j} = a_{1j} \ln x_1^* + a_{2j} \ln x_2^* + \cdots + a_{nj} \ln x_n^*, \quad j = 1, 2, \ldots, N \tag{8.16}$$

By letting

$$w_i = \ln x_i^*, \quad i = 1, 2, \ldots, n \tag{8.17}$$

Eqs. (8.16) can be written as

$$a_{11} w_1 + a_{21} w_2 + \cdots + a_{n1} w_n = \ln \frac{f^* \Delta_1^*}{c_1}$$
$$a_{12} w_1 + a_{22} w_2 + \cdots + a_{n2} w_n = \ln \frac{f^* \Delta_2^*}{c_2}$$
$$\vdots \tag{8.18}$$
$$a_{1N} w_1 + a_{2N} w_2 + \cdots + a_{nN} w_n = \ln \frac{f^* \Delta_N^*}{c_N}$$

These equations, in the case of problems with a zero degree of difficulty, give a unique
solution to $w_1, w_2, \ldots, w_n$. Once the $w_i$ are found, the desired solution can be obtained as

$$x_i^* = e^{w_i}, \quad i = 1, 2, \ldots, n \tag{8.19}$$

In a general geometric programming problem with a nonnegative degree of difficulty,
$N \ge n + 1$, and hence Eqs. (8.18) denote $N$ equations in $n$ unknowns. By choosing
any $n$ linearly independent equations, we obtain a set of solutions $w_i$ and hence $x_i^*$.
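The logarithmic transformation turns the recovery of the design variables into a small linear solve. The sketch below is one possible arrangement, not a definitive implementation: the helper `solve` is a plain Gaussian elimination written for the example, and the numerical data ($c_j$, exponents, $\Delta_j^*$, and $f^* = 200$) are those of the grain-box example that follows in the text. The first three of the four equations (8.18) are used; they are linearly independent for this exponent matrix.

```python
import math


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("chosen equations are not linearly independent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):           # back substitution
        s = sum(M[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (M[r][n] - s) / M[r][r]
    return w


c = [80.0, 40.0, 20.0, 80.0]
a = [[1, 0, 1, -1],          # a[i][j]: exponent of x_(i+1) in term j+1
     [1, 1, 0, -1],
     [0, 1, 1, -1]]
delta = [0.2, 0.2, 0.2, 0.4]
f_star = 200.0

n = len(a)
terms = [0, 1, 2]            # indices j of the n equations kept from Eqs. (8.18)
A = [[a[i][j] for i in range(n)] for j in terms]
b = [math.log(f_star * delta[j] / c[j]) for j in terms]

w = solve(A, b)                       # w_i = ln x_i*
x_star = [math.exp(wi) for wi in w]   # Eq. (8.19)
print([round(xi, 6) for xi in x_star])
```

The equation left out can serve as a consistency check: substituting the recovered $x_i^*$ into it should reproduce its right-hand side.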

The solution of an unconstrained geometric programming problem is illustrated 
with the help of the following zero-degree-of-difficulty example [8.1]. 

Example 8.1 It has been decided to shift grain from a warehouse to a factory in an
open rectangular box of length x_1 meters, width x_2 meters, and height x_3 meters. The
bottom, sides, and the ends of the box cost, respectively, $80, $10, and $20/m^2. It costs
$1 for each round trip of the box. Assuming that the box will have no salvage value,
find the minimum cost of transporting 80 m^3 of grain.


SOLUTION The total cost of transportation is given by

total cost = cost of box + cost of transportation
= (cost of sides + cost of bottom + cost of ends of the box)
+ (number of round trips required for transporting the grain
× cost of each round trip)

$$f(X) = [(2 x_1 x_3) 10 + (x_1 x_2) 80 + (2 x_2 x_3) 20] + \left[ \frac{80}{x_1 x_2 x_3} \right] (1) = \$ \left( 80 x_1 x_2 + 40 x_2 x_3 + 20 x_1 x_3 + \frac{80}{x_1 x_2 x_3} \right) \tag{E$_1$}$$

where $x_1$, $x_2$, and $x_3$ indicate the dimensions of the box, as shown in Fig. 8.1. By
comparing Eq. (E$_1$) with the general posynomial of Eq. (8.1), we obtain

$$c_1 = 80, \quad c_2 = 40, \quad c_3 = 20, \quad c_4 = 80$$

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ 0 & 1 & 1 & -1 \end{pmatrix}$$

The orthogonality and normality conditions are given by

$$\begin{bmatrix} 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ 0 & 1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{Bmatrix} \Delta_1 \\ \Delta_2 \\ \Delta_3 \\ \Delta_4 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{Bmatrix}$$

Figure 8.1 Open rectangular box.


that is,

$$\Delta_1 + \Delta_3 - \Delta_4 = 0 \tag{E2}$$
$$\Delta_1 + \Delta_2 - \Delta_4 = 0 \tag{E3}$$
$$\Delta_2 + \Delta_3 - \Delta_4 = 0 \tag{E4}$$
$$\Delta_1 + \Delta_2 + \Delta_3 + \Delta_4 = 1 \tag{E5}$$

From Eqs. (E2) and (E3), we obtain

$$\Delta_4 = \Delta_1 + \Delta_3 = \Delta_1 + \Delta_2 \quad \text{or} \quad \Delta_2 = \Delta_3 \tag{E6}$$

Similarly, Eqs. (E3) and (E4) give us

$$\Delta_4 = \Delta_1 + \Delta_2 = \Delta_2 + \Delta_3 \quad \text{or} \quad \Delta_1 = \Delta_3 \tag{E7}$$

Equations (E6) and (E7) yield

$$\Delta_1 = \Delta_2 = \Delta_3$$

while Eq. (E6) gives

$$\Delta_4 = \Delta_1 + \Delta_3 = 2\Delta_1$$

Finally, Eq. (E5) leads to the unique solution

$$\Delta_1^* = \Delta_2^* = \Delta_3^* = \tfrac{1}{5} \quad \text{and} \quad \Delta_4^* = \tfrac{2}{5}$$

Thus the optimal value of the objective function can be found from Eq. (8.13) as

$$f^* = \left(\frac{80}{1/5}\right)^{1/5} \left(\frac{40}{1/5}\right)^{1/5} \left(\frac{20}{1/5}\right)^{1/5} \left(\frac{80}{2/5}\right)^{2/5} = (4 \times 10^2)^{1/5} (2 \times 10^2)^{1/5} (1 \times 10^2)^{1/5} (4 \times 10^4)^{1/5} = (32 \times 10^{10})^{1/5} = \$200$$

It can be seen that the minimum total cost has been obtained before finding the
optimal size of the box. To find the optimal values of the design variables, let us write
Eqs. (8.14) as


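$$U_1^* = 80 x_1^* x_2^* = \Delta_1^* f^* = \tfrac{1}{5}(200) = 40 \tag{E8}$$
$$U_2^* = 40 x_2^* x_3^* = \Delta_2^* f^* = \tfrac{1}{5}(200) = 40 \tag{E9}$$
$$U_3^* = 20 x_1^* x_3^* = \Delta_3^* f^* = \tfrac{1}{5}(200) = 40 \tag{E10}$$
$$U_4^* = \frac{80}{x_1^* x_2^* x_3^*} = \Delta_4^* f^* = \tfrac{2}{5}(200) = 80 \tag{E11}$$

From these equations, we obtain

$$x_2^* = \frac{1}{2 x_1^*} = \frac{1}{x_3^*}, \qquad x_1^* = \frac{x_3^*}{2}, \qquad x_1^* x_3^* = 2 (x_1^*)^2 = 2$$

Therefore,

$$x_1^* = 1 \text{ m}, \quad x_2^* = \tfrac{1}{2} \text{ m}, \quad x_3^* = 2 \text{ m} \tag{E12}$$

It is to be noticed that there is one redundant equation among Eqs. (E8) to (E11), which
is not needed for the solution of $x_i^*$ ($i = 1$ to $n$).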

The solution given in Eq. (E12) can also be obtained using Eqs. (8.18). In the
present case, Eqs. (8.18) lead to

$$1w_1 + 1w_2 + 0w_3 = \ln \frac{200 \times \frac{1}{5}}{80} = \ln \frac{1}{2} \tag{E13}$$
$$0w_1 + 1w_2 + 1w_3 = \ln \frac{200 \times \frac{1}{5}}{40} = \ln 1 \tag{E14}$$
$$1w_1 + 0w_2 + 1w_3 = \ln \frac{200 \times \frac{1}{5}}{20} = \ln 2 \tag{E15}$$
$$-1w_1 - 1w_2 - 1w_3 = \ln \frac{200 \times \frac{2}{5}}{80} = \ln 1 \tag{E16}$$

By adding Eqs. (E13), (E14), and (E16), we obtain

$$w_2 = \ln \tfrac{1}{2} + \ln 1 + \ln 1 = \ln\left(\tfrac{1}{2} \cdot 1 \cdot 1\right) = \ln \tfrac{1}{2} = \ln x_2^*$$

or

$$x_2^* = \tfrac{1}{2}$$

Similarly, by adding Eqs. (E13), (E15), and (E16), we get

$$w_1 = \ln \tfrac{1}{2} + \ln 2 + \ln 1 = \ln 1 = \ln x_1^*$$

or

$$x_1^* = 1$$

Finally, we can obtain $x_3^*$ by adding Eqs. (E14), (E15), and (E16) as

$$w_3 = \ln 1 + \ln 2 + \ln 1 = \ln 2 = \ln x_3^*$$

or

$$x_3^* = 2$$

It can be noticed that there are four equations, Eqs. (E13) to (E16), in three unknowns
$w_1$, $w_2$, and $w_3$. However, not all of them are linearly independent. In this case, the
first three equations only are linearly independent, and the fourth equation, (E16), can
be obtained by adding Eqs. (E13), (E14), and (E15), and dividing the result by $-2$.
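The steps of Example 8.1 can be retraced numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`c`, `a`, `delta`, `f_star`, `x_star`) are illustrative and not part of the text. It solves the orthogonality and normality conditions as one linear system, evaluates $f^*$ from Eq. (8.13), and then solves the overdetermined but consistent system (8.18) in the least-squares sense.

```python
import numpy as np

# Data of Example 8.1: f = 80 x1 x2 + 40 x2 x3 + 20 x1 x3 + 80/(x1 x2 x3)
c = np.array([80.0, 40.0, 20.0, 80.0])
# a[i, j] is the exponent of x_(i+1) in term j+1
a = np.array([[1.0, 0.0, 1.0, -1.0],
              [1.0, 1.0, 0.0, -1.0],
              [0.0, 1.0, 1.0, -1.0]])

# Orthogonality rows (a @ delta = 0) stacked on the normality row (sum = 1)
lhs = np.vstack([a, np.ones(4)])
rhs = np.array([0.0, 0.0, 0.0, 1.0])
delta = np.linalg.solve(lhs, rhs)

# Eq. (8.13): f* = prod_j (c_j / delta_j) ** delta_j
f_star = float(np.prod((c / delta) ** delta))

# Eqs. (8.18): four equations in three unknowns w_i = ln x_i*.
# The system is consistent, so least squares returns its exact solution.
w, *_ = np.linalg.lstsq(a.T, np.log(delta * f_star / c), rcond=None)
x_star = np.exp(w)

print(delta, f_star, x_star)
```

Using a least-squares solve avoids having to pick by hand which three of the four equations are linearly independent.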


8.5 SOLUTION OF AN UNCONSTRAINED GEOMETRIC 
PROGRAMMING PROBLEM USING 
ARITHMETIC-GEOMETRIC INEQUALITY 

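The arithmetic mean-geometric mean inequality (also known as the arithmetic-geometric
inequality or Cauchy's inequality) is given by [8.1]

$$\Delta_1 u_1 + \Delta_2 u_2 + \cdots + \Delta_N u_N \ge u_1^{\Delta_1} u_2^{\Delta_2} \cdots u_N^{\Delta_N} \tag{8.20}$$

with

$$\Delta_1 + \Delta_2 + \cdots + \Delta_N = 1 \tag{8.21}$$

This inequality is found to be very useful in solving geometric programming problems.
Using the inequality of (8.20), the objective function of Eq. (8.3) can be written as (by
setting $U_i = u_i \Delta_i$, $i = 1, 2, \ldots, N$)

$$U_1 + U_2 + \cdots + U_N \ge \left(\frac{U_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{U_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{U_N}{\Delta_N}\right)^{\Delta_N} \tag{8.22}$$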

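where $U_i = U_i(\mathbf{X})$, $i = 1, 2, \ldots, N$, and the weights $\Delta_1, \Delta_2, \ldots, \Delta_N$ satisfy
Eq. (8.21). The left-hand side of the inequality (8.22) [i.e., the original function $f(\mathbf{X})$]
is called the primal function. The right side of inequality (8.22) is called the predual
function. By using the known relations

$$U_j = c_j \prod_{i=1}^{n} x_i^{a_{ij}}, \quad j = 1, 2, \ldots, N \tag{8.23}$$

the predual function can be expressed as

$$
\begin{aligned}
\left(\frac{U_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{U_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{U_N}{\Delta_N}\right)^{\Delta_N}
&= \left(\frac{c_1 \prod_{i=1}^{n} x_i^{a_{i1}}}{\Delta_1}\right)^{\Delta_1} \left(\frac{c_2 \prod_{i=1}^{n} x_i^{a_{i2}}}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{c_N \prod_{i=1}^{n} x_i^{a_{iN}}}{\Delta_N}\right)^{\Delta_N} \\
&= \left(\frac{c_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{c_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{c_N}{\Delta_N}\right)^{\Delta_N} \left\{ \left(\prod_{i=1}^{n} x_i^{a_{i1}}\right)^{\Delta_1} \left(\prod_{i=1}^{n} x_i^{a_{i2}}\right)^{\Delta_2} \cdots \left(\prod_{i=1}^{n} x_i^{a_{iN}}\right)^{\Delta_N} \right\} \\
&= \left(\frac{c_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{c_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{c_N}{\Delta_N}\right)^{\Delta_N} \left\{ x_1^{\sum_{j=1}^{N} a_{1j}\Delta_j} \, x_2^{\sum_{j=1}^{N} a_{2j}\Delta_j} \cdots x_n^{\sum_{j=1}^{N} a_{nj}\Delta_j} \right\}
\end{aligned} \tag{8.24}
$$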


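If we select the weights $\Delta_j$ so as to satisfy the normalization condition, Eq. (8.21),
and also the orthogonality relations

$$\sum_{j=1}^{N} a_{ij} \Delta_j = 0, \quad i = 1, 2, \ldots, n \tag{8.25}$$

Eq. (8.24) reduces to

$$\left(\frac{U_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{U_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{U_N}{\Delta_N}\right)^{\Delta_N} = \left(\frac{c_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{c_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{c_N}{\Delta_N}\right)^{\Delta_N} \tag{8.26}$$

Thus the inequality (8.22) becomes

$$U_1 + U_2 + \cdots + U_N \ge \left(\frac{c_1}{\Delta_1}\right)^{\Delta_1} \left(\frac{c_2}{\Delta_2}\right)^{\Delta_2} \cdots \left(\frac{c_N}{\Delta_N}\right)^{\Delta_N} \tag{8.27}$$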

In this inequality, the right side is called the dual function, $v(\Delta_1, \Delta_2, \ldots, \Delta_N)$. The
inequality (8.27) can be written simply as

$$f \ge v \tag{8.28}$$

A basic result is that the maximum of the dual function equals the minimum of the
primal function. Proof of this theorem is given in the next section. The theorem enables
us to accomplish the optimization by minimizing the primal or by maximizing the dual,
whichever is easier. Also, the maximization of the dual function subject to the
orthogonality and normality conditions is a sufficient condition for $f$, the primal function, to
be a global minimum.
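The two inequalities (8.22) and (8.27) can be spot-checked numerically. The sketch below is only an illustration, assuming NumPy and reusing the posynomial of Example 8.1; the sampling ranges and variable names are arbitrary choices, not part of the text. For random positive designs it compares the primal value with the predual function for arbitrary normalized weights, and with the dual value obtained from the weights that also satisfy orthogonality.

```python
import numpy as np

rng = np.random.default_rng(0)

# Posynomial of Example 8.1 (coefficients and exponent matrix)
c = np.array([80.0, 40.0, 20.0, 80.0])
a = np.array([[1.0, 0.0, 1.0, -1.0],
              [1.0, 1.0, 0.0, -1.0],
              [0.0, 1.0, 1.0, -1.0]])

# Weights satisfying both normality (8.21) and orthogonality (8.25)
delta_orth = np.array([0.2, 0.2, 0.2, 0.4])
dual_value = float(np.prod((c / delta_orth) ** delta_orth))  # right side of (8.27)

worst_predual_gap = np.inf
worst_dual_gap = np.inf
for _ in range(2000):
    x = rng.uniform(0.2, 5.0, size=3)              # random positive design
    terms = c * np.prod(x[:, None] ** a, axis=0)   # U_j(X), j = 1..N
    weights = rng.dirichlet(np.ones(4))            # any weights summing to 1
    predual = np.prod((terms / weights) ** weights)
    worst_predual_gap = min(worst_predual_gap, terms.sum() - predual)
    worst_dual_gap = min(worst_dual_gap, terms.sum() - dual_value)

# Both gaps should be nonnegative up to rounding error
print(worst_predual_gap, worst_dual_gap)
```

The predual bound depends on the sampled design through the terms $U_j$, whereas the dual bound does not, which is what makes the orthogonal weights useful.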


8.6 PRIMAL-DUAL RELATIONSHIP AND SUFFICIENCY 
CONDITIONS IN THE UNCONSTRAINED CASE 

If $f^*$ indicates the minimum of the primal function and $v^*$ denotes the maximum of
the dual function, Eq. (8.28) states that

$$f \ge f^* \ge v^* \ge v \tag{8.29}$$

In this section we prove that $f^* = v^*$ and also that $f^*$ corresponds to the global
minimum of $f(\mathbf{X})$. For convenience of notation, let us denote the objective function
$f(\mathbf{X})$ by $x_0$ and make the exponential transformation

$$e^{w_i} = x_i \quad \text{or} \quad w_i = \ln x_i, \quad i = 0, 1, 2, \ldots, n \tag{8.30}$$


where the variables $w_i$ are unrestricted in sign. Define the new variables $\Delta_j$, also
termed weights, as

$$\Delta_j = \frac{U_j}{x_0} = \frac{c_j \prod_{i=1}^{n} x_i^{a_{ij}}}{x_0}, \quad j = 1, 2, \ldots, N \tag{8.31}$$

which can be seen to be positive and satisfy the relation

$$\sum_{j=1}^{N} \Delta_j = 1 \tag{8.32}$$

By taking logarithms on both sides of Eq. (8.31), we obtain

$$\ln \Delta_j = \ln c_j + \sum_{i=1}^{n} a_{ij} \ln x_i - \ln x_0 \tag{8.33}$$

or

$$\ln \frac{\Delta_j}{c_j} = \sum_{i=1}^{n} a_{ij} w_i - w_0, \quad j = 1, 2, \ldots, N \tag{8.34}$$

Thus the original problem of minimizing $f(\mathbf{X})$ with no constraints can be replaced
by one of minimizing $w_0$ subject to the equality constraints given by Eqs. (8.32) and
(8.34). The objective function $x_0$ is given by

$$x_0 = e^{w_0} = \sum_{j=1}^{N} c_j \prod_{i=1}^{n} e^{a_{ij} w_i} = \sum_{j=1}^{N} c_j \, e^{\sum_{i=1}^{n} a_{ij} w_i} \tag{8.35}$$

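Since the exponential function ($e^{a_{ij} w_i}$) is convex with respect to $w_i$, the objective
function $x_0$, which is a positive combination of exponential functions, is also convex
(see Problem 8.15). Hence there is only one stationary point for $x_0$ and it must be the
global minimum. The global minimum point of $w_0$ can be obtained by constructing the
following Lagrangian function and finding its stationary point:

$$L(\mathbf{w}, \boldsymbol{\Delta}, \boldsymbol{\lambda}) = w_0 + \lambda_0 \left( \sum_{j=1}^{N} \Delta_j - 1 \right) + \sum_{j=1}^{N} \lambda_j \left( \sum_{i=1}^{n} a_{ij} w_i - w_0 - \ln \frac{\Delta_j}{c_j} \right) \tag{8.36}$$

where

$$\mathbf{w} = \begin{Bmatrix} w_0 \\ w_1 \\ \vdots \\ w_n \end{Bmatrix}, \quad \boldsymbol{\Delta} = \begin{Bmatrix} \Delta_1 \\ \Delta_2 \\ \vdots \\ \Delta_N \end{Bmatrix}, \quad \boldsymbol{\lambda} = \begin{Bmatrix} \lambda_0 \\ \lambda_1 \\ \vdots \\ \lambda_N \end{Bmatrix} \tag{8.37}$$

with $\boldsymbol{\lambda}$ denoting the vector of Lagrange multipliers. At the stationary point of $L$, we
have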

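$$
\begin{aligned}
\frac{\partial L}{\partial w_i} &= 0, \quad i = 0, 1, 2, \ldots, n \\
\frac{\partial L}{\partial \Delta_j} &= 0, \quad j = 1, 2, \ldots, N \\
\frac{\partial L}{\partial \lambda_i} &= 0, \quad i = 0, 1, 2, \ldots, N
\end{aligned} \tag{8.38}
$$

These equations yield the following relations:

$$1 - \sum_{j=1}^{N} \lambda_j = 0 \quad \text{or} \quad \sum_{j=1}^{N} \lambda_j = 1 \tag{8.39}$$

$$\sum_{j=1}^{N} \lambda_j a_{ij} = 0, \quad i = 1, 2, \ldots, n \tag{8.40}$$

$$\lambda_0 - \frac{\lambda_j}{\Delta_j} = 0 \quad \text{or} \quad \lambda_0 = \frac{\lambda_j}{\Delta_j}, \quad j = 1, 2, \ldots, N \tag{8.41}$$

$$\sum_{j=1}^{N} \Delta_j - 1 = 0 \quad \text{or} \quad \sum_{j=1}^{N} \Delta_j = 1 \tag{8.42}$$

$$-\ln \frac{\Delta_j}{c_j} + \sum_{i=1}^{n} a_{ij} w_i - w_0 = 0, \quad j = 1, 2, \ldots, N \tag{8.43}$$

Equations (8.39), (8.41), and (8.42) give the relation

$$\sum_{j=1}^{N} \lambda_j = 1 = \sum_{j=1}^{N} \lambda_0 \Delta_j = \lambda_0 \sum_{j=1}^{N} \Delta_j = \lambda_0 \tag{8.44}$$

Thus the values of the Lagrange multipliers are given by

$$\lambda_j = \begin{cases} 1 & \text{for } j = 0 \\ \Delta_j & \text{for } j = 1, 2, \ldots, N \end{cases} \tag{8.45}$$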




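By substituting Eq. (8.45) into Eq. (8.36), we obtain

$$L(\boldsymbol{\Delta}, \mathbf{w}) = -\sum_{j=1}^{N} \Delta_j \ln \frac{\Delta_j}{c_j} + (1 - w_0) \left( \sum_{j=1}^{N} \Delta_j - 1 \right) + \sum_{i=1}^{n} w_i \left( \sum_{j=1}^{N} a_{ij} \Delta_j \right) \tag{8.46}$$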


The function given in Eq. (8.46) can be considered as the Lagrangian function
corresponding to a new optimization problem whose objective function $v(\boldsymbol{\Delta})$ is given by

$$v(\boldsymbol{\Delta}) = -\sum_{j=1}^{N} \Delta_j \ln \frac{\Delta_j}{c_j} = \ln \left[ \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j} \right)^{\Delta_j} \right] \tag{8.47}$$

and the constraints by

$$\sum_{j=1}^{N} \Delta_j - 1 = 0 \tag{8.48}$$

$$\sum_{j=1}^{N} a_{ij} \Delta_j = 0, \quad i = 1, 2, \ldots, n \tag{8.49}$$

This problem will be the dual for the original problem. The quantities $(1 - w_0)$,
$w_1, w_2, \ldots, w_n$ can be regarded as the Lagrange multipliers for the constraints given
by Eqs. (8.48) and (8.49).

Now it is evident that the vector $\boldsymbol{\Delta}$ which makes the Lagrangian of Eq. (8.46)
stationary will automatically give a stationary point for that of Eq. (8.36). It can be
proved that the function

$$\Delta_j \ln \frac{\Delta_j}{c_j}, \quad j = 1, 2, \ldots, N$$

is convex (see Problem 8.16) since $\Delta_j$ is positive. Since the function $v(\boldsymbol{\Delta})$ is given
by the negative of a sum of convex functions, it will be a concave function. Hence
the function $v(\boldsymbol{\Delta})$ will have a unique stationary point that will be its global maximum
point. Hence the minimum of the original primal function is the same as the maximum of
the function given by Eq. (8.47) subject to the normality and orthogonality conditions
given by Eqs. (8.48) and (8.49) with the variables $\Delta_j$ constrained to be positive.

By substituting the optimal solution $\boldsymbol{\Delta}^*$, the optimal value of the objective function
becomes

$$v^* = v(\boldsymbol{\Delta}^*) = L(\mathbf{w}^*, \boldsymbol{\Delta}^*) = w_0^* = L(\mathbf{w}^*, \boldsymbol{\Delta}^*, \boldsymbol{\lambda}^*) = -\sum_{j=1}^{N} \Delta_j^* \ln \frac{\Delta_j^*}{c_j} \tag{8.50}$$

By taking the exponentials and using the transformation relation (8.30), we get

$$f^* = \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j^*} \right)^{\Delta_j^*} \tag{8.51}$$

Primal and Dual Problems. We saw that geometric programming treats the problem
of minimizing posynomials and maximizing product functions. The minimization
problems are called primal programs and the maximization problems are called dual
programs. Table 8.1 gives the primal and dual programs corresponding to an
unconstrained minimization problem.
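As a numerical illustration of the primal program and of the equality $f^* = v^*$, the primal posynomial can be minimized directly in the transformed variables $w_i = \ln x_i$, in which it is convex. The sketch below is one possible approach, assuming NumPy and the data of Example 8.1; the damped Newton iteration, tolerances, and names are choices made here for illustration and are not the procedure used in the text.

```python
import numpy as np

# Posynomial of Example 8.1: coefficients and exponent matrix a[i, j]
c = np.array([80.0, 40.0, 20.0, 80.0])
a = np.array([[1.0, 0.0, 1.0, -1.0],
              [1.0, 1.0, 0.0, -1.0],
              [0.0, 1.0, 1.0, -1.0]])

def f(w):
    """Primal function in terms of w, as in Eq. (8.35)."""
    return float(np.sum(c * np.exp(a.T @ w)))

w = np.zeros(3)                       # start from x = (1, 1, 1)
for _ in range(50):
    terms = c * np.exp(a.T @ w)       # U_j at the current point
    grad = a @ terms                  # df/dw_i = sum_j a_ij U_j
    if np.linalg.norm(grad) < 1e-9:
        break
    hess = (a * terms) @ a.T          # sum_j U_j a_j a_j^T (positive definite here)
    step = np.linalg.solve(hess, -grad)
    t = 1.0
    # Backtracking line search keeps the Newton step from overshooting
    while f(w + t * step) > f(w) + 1e-4 * t * (grad @ step):
        t *= 0.5
    w = w + t * step

f_min = f(w)
x_min = np.exp(w)
weights = c * np.exp(a.T @ w) / f_min  # Delta_j = U_j / x_0, Eq. (8.31)
print(f_min, x_min, weights)
```

At convergence the ratios $U_j / f$ computed in the last line play the role of the weights $\Delta_j$ of Eq. (8.31), so they can be compared with the dual solution found from the orthogonality and normality conditions.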

Computational Procedure. To solve a given unconstrained minimization problem,
we construct the dual function $v(\boldsymbol{\Delta})$ and maximize either $v(\boldsymbol{\Delta})$ or $\ln v(\boldsymbol{\Delta})$, whichever
is convenient, subject to the constraints given by Eqs. (8.48) and (8.49). If the degree
of difficulty of the problem is zero, there will be a unique solution for the $\Delta_j^*$'s.

For problems with degree of difficulty greater than zero, there will be more variables
$\Delta_j$ ($j = 1, 2, \ldots, N$) than the number of equations ($n + 1$). Sometimes it will
be possible for us to express any ($n + 1$) number of $\Delta_j$'s in terms of the remaining
($N - n - 1$) number of $\Delta_j$'s. In such cases, our problem will be to maximize $v(\boldsymbol{\Delta})$ or
$\ln v(\boldsymbol{\Delta})$ with respect to the ($N - n - 1$) independent $\Delta_j$'s. This procedure is illustrated
with the help of the following one-degree-of-difficulty example.

Table 8.1 Primal and Dual Programs Corresponding to an Unconstrained
Minimization Problem

Primal program

Find $\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$ so that

$$f(\mathbf{X}) = \sum_{j=1}^{N} c_j x_1^{a_{1j}} x_2^{a_{2j}} \cdots x_n^{a_{nj}} \to \text{minimum}$$

$$x_1 > 0, \; x_2 > 0, \; \ldots, \; x_n > 0$$

Dual program

Find $\boldsymbol{\Delta} = \{\Delta_1, \Delta_2, \ldots, \Delta_N\}^T$ so that

$$v(\boldsymbol{\Delta}) = \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j} \right)^{\Delta_j} \quad \text{or} \quad \ln v(\boldsymbol{\Delta}) = \ln \left[ \prod_{j=1}^{N} \left( \frac{c_j}{\Delta_j} \right)^{\Delta_j} \right] \to \text{maximum} \tag{8.47}$$

subject to the constraints

$$\sum_{j=1}^{N} \Delta_j = 1 \tag{8.48}$$

$$\sum_{j=1}^{N} a_{ij} \Delta_j = 0, \quad i = 1, 2, \ldots, n \tag{8.49}$$

Example 8.2 In a certain reservoir pump installation, the first cost of the pipe is given
by $(100D + 50D^2)$, where $D$ is the diameter of the pipe in centimeters. The cost of
the reservoir decreases with an increase in the quantity of fluid handled and is given
by $20/Q$, where $Q$ is the rate at which the fluid is handled (cubic meters per second).
The pumping cost is given by $(300Q^2/D^5)$. Find the optimal size of the pipe and the
amount of fluid handled for minimum overall cost.


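SOLUTION

$$f(D, Q) = 100D^1Q^0 + 50D^2Q^0 + 20D^0Q^{-1} + 300D^{-5}Q^2 \tag{E1}$$

Here we can see that

$$c_1 = 100, \quad c_2 = 50, \quad c_3 = 20, \quad c_4 = 300$$

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0 & -5 \\ 0 & 0 & -1 & 2 \end{pmatrix}$$

The orthogonality and normality conditions are given by

$$\begin{bmatrix} 1 & 2 & 0 & -5 \\ 0 & 0 & -1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{Bmatrix} \Delta_1 \\ \Delta_2 \\ \Delta_3 \\ \Delta_4 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \\ 1 \end{Bmatrix}$$

Since $N > (n + 1)$, these equations do not yield the required $\Delta_j$ ($j = 1$ to 4) directly.
But any three of the $\Delta_j$'s can be expressed in terms of the remaining one. Hence by
solving for $\Delta_1$, $\Delta_2$, and $\Delta_3$ in terms of $\Delta_4$, we obtain

$$\Delta_1 = 2 - 11\Delta_4, \qquad \Delta_2 = 8\Delta_4 - 1, \qquad \Delta_3 = 2\Delta_4 \tag{E2}$$

The dual problem can now be written as

$$\text{Maximize } v(\Delta_1, \Delta_2, \Delta_3, \Delta_4) = \left( \frac{100}{2 - 11\Delta_4} \right)^{2 - 11\Delta_4} \left( \frac{50}{8\Delta_4 - 1} \right)^{8\Delta_4 - 1} \left( \frac{20}{2\Delta_4} \right)^{2\Delta_4} \left( \frac{300}{\Delta_4} \right)^{\Delta_4}$$

Since the maximization of $v$ is equivalent to the maximization of $\ln v$, we will maximize
$\ln v$ for convenience. Thus

$$\ln v = (2 - 11\Delta_4)[\ln 100 - \ln(2 - 11\Delta_4)] + (8\Delta_4 - 1)[\ln 50 - \ln(8\Delta_4 - 1)] + 2\Delta_4[\ln 20 - \ln(2\Delta_4)] + \Delta_4[\ln 300 - \ln(\Delta_4)]$$

Since $\ln v$ is expressed as a function of $\Delta_4$ alone, the value of $\Delta_4$ that maximizes $\ln v$
must be unique (because the primal problem has a unique solution). The necessary
condition for the maximum of $\ln v$ gives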


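$$
\begin{aligned}
\frac{\partial}{\partial \Delta_4}(\ln v) ={}& -11[\ln 100 - \ln(2 - 11\Delta_4)] + (2 - 11\Delta_4)\frac{11}{2 - 11\Delta_4} \\
& + 8[\ln 50 - \ln(8\Delta_4 - 1)] + (8\Delta_4 - 1)\left(-\frac{8}{8\Delta_4 - 1}\right) \\
& + 2[\ln 20 - \ln(2\Delta_4)] + 2\Delta_4\left(-\frac{2}{2\Delta_4}\right) \\
& + 1[\ln 300 - \ln(\Delta_4)] + \Delta_4\left(-\frac{1}{\Delta_4}\right) = 0
\end{aligned}
$$

This gives after simplification

$$\ln \frac{(2 - 11\Delta_4)^{11}}{(8\Delta_4 - 1)^8 (2\Delta_4)^2 \Delta_4} - \ln \frac{(100)^{11}}{(50)^8 (20)^2 (300)} = 0$$

i.e.,

$$\frac{(2 - 11\Delta_4)^{11}}{(8\Delta_4 - 1)^8 (2\Delta_4)^2 \Delta_4} = \frac{(100)^{11}}{(50)^8 (20)^2 (300)} = 2130 \tag{E3}$$

from which the value of $\Delta_4^*$ can be obtained by using a trial-and-error process as
follows:

Value of $\Delta_4^*$        Value of left-hand side of Eq. (E3)

2/11 = 0.182        0.0
0.15                $(0.35)^{11} / [(0.2)^8 (0.3)^2 (0.15)] \approx 284$
0.147               $(0.385)^{11} / [(0.175)^8 (0.294)^2 (0.147)] \approx 2210$
0.146               $(0.39)^{11} / [(0.169)^8 (0.292)^2 (0.146)] \approx 4500$

Thus we find that $\Delta_4^* \approx 0.147$, and Eqs. (E2) give

$$\Delta_1^* = 2 - 11\Delta_4^* = 0.385, \qquad \Delta_2^* = 8\Delta_4^* - 1 = 0.175, \qquad \Delta_3^* = 2\Delta_4^* = 0.294$$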

The optimal value of the objective function is given by

$$v^* = f^* = \left( \frac{100}{0.385} \right)^{0.385} \left( \frac{50}{0.175} \right)^{0.175} \left( \frac{20}{0.294} \right)^{0.294} \left( \frac{300}{0.147} \right)^{0.147} = 8.5 \times 2.69 \times 3.46 \times 3.06 = 242$$

The optimum values of the design variables can be found from

$$
\begin{aligned}
U_1^* &= \Delta_1^* f^* = (0.385)(242) = 92.2 \\
U_2^* &= \Delta_2^* f^* = (0.175)(242) = 42.4 \\
U_3^* &= \Delta_3^* f^* = (0.294)(242) = 71.1 \\
U_4^* &= \Delta_4^* f^* = (0.147)(242) = 35.6
\end{aligned} \tag{E4}
$$

From Eqs. (E1) and (E4), we have

$$
\begin{aligned}
U_1^* &= 100D^* = 92.2 \\
U_2^* &= 50D^{*2} = 42.4 \\
U_3^* &= \frac{20}{Q^*} = 71.1 \\
U_4^* &= \frac{300Q^{*2}}{D^{*5}} = 35.6
\end{aligned}
$$

These equations can be solved to find the desired solution $D^* = 0.922$ cm, $Q^* = 0.281$ m$^3$/s.
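The trial-and-error search for $\Delta_4^*$ in Example 8.2 can be replaced by a simple bisection on the derivative of $\ln v$. The sketch below uses only the Python standard library; the function names, bracket offsets, and iteration count are illustrative assumptions. It relies on the observation, made in the derivation above, that the constant terms of the derivative cancel, and it recovers $D^*$ and $Q^*$ from the first and third terms of the objective.

```python
import math

def weights(d4):
    """Eqs. (E2): dual weights expressed in terms of Delta_4."""
    return (2.0 - 11.0 * d4, 8.0 * d4 - 1.0, 2.0 * d4, d4)

def dlnv(d4):
    """d(ln v)/d(Delta_4) after the constant terms cancel."""
    d1, d2, d3, _ = weights(d4)
    return (-11.0 * math.log(100.0 / d1) + 8.0 * math.log(50.0 / d2)
            + 2.0 * math.log(20.0 / d3) + math.log(300.0 / d4))

# All weights are positive only for 1/8 < Delta_4 < 2/11
lo, hi = 1.0 / 8.0 + 1e-9, 2.0 / 11.0 - 1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dlnv(mid) > 0.0:   # ln v still increasing: the maximum lies to the right
        lo = mid
    else:
        hi = mid
d4 = 0.5 * (lo + hi)

d1, d2, d3, _ = weights(d4)
f_star = ((100.0 / d1) ** d1 * (50.0 / d2) ** d2
          * (20.0 / d3) ** d3 * (300.0 / d4) ** d4)
D_star = d1 * f_star / 100.0     # from U1* = 100 D* = Delta_1* f*
Q_star = 20.0 / (d3 * f_star)    # from U3* = 20 / Q* = Delta_3* f*
print(d4, f_star, D_star, Q_star)
```

Because the hand calculation rounds $\Delta_4^*$ to three digits, small differences in the last digit of the computed quantities are to be expected.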


8.7 CONSTRAINED MINIMIZATION 

Most engineering optimization problems are subject to constraints. If the objective 
function and all the constraints are expressible in the form of posynomials, geometric 
programming can be used most conveniently to solve the optimization problem. Let 
the constrained minimization problem be stated as 


Find $\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$

which minimizes the objective function

$$f(\mathbf{X}) = \sum_{j=1}^{N_0} c_{0j} \prod_{i=1}^{n} x_i^{a_{0ij}} \tag{8.52}$$

and satisfies the constraints

$$g_k(\mathbf{X}) = \sum_{j=1}^{N_k} c_{kj} \prod_{i=1}^{n} x_i^{a_{kij}} \lesseqgtr 1, \quad k = 1, 2, \ldots, m \tag{8.53}$$

where the coefficients $c_{0j}$ ($j = 1, 2, \ldots, N_0$) and $c_{kj}$ ($k = 1, 2, \ldots, m$; $j = 1, 2, \ldots, N_k$)
are positive numbers, the exponents $a_{0ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, N_0$)
and $a_{kij}$ ($k = 1, 2, \ldots, m$; $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, N_k$) are any real numbers,
$m$ indicates the total number of constraints, $N_0$ represents the number of terms in
the objective function, and $N_k$ denotes the number of terms in the $k$th constraint.
The design variables $x_1, x_2, \ldots, x_n$ are assumed to take only positive values in
Eqs. (8.52) and (8.53). The solution of the constrained minimization problem stated
above is considered in the next section.
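The posynomials of Eqs. (8.52) and (8.53) share one structure, so a single helper can evaluate the objective and every constraint. The following is a minimal sketch, assuming NumPy; the function name `posynomial` and the sample data (a three-term objective and a one-term constraint in three variables) are hypothetical and chosen only for illustration.

```python
import numpy as np

def posynomial(coeffs, exponents, x):
    """Evaluate sum_j c_j * prod_i x_i ** a_ij.

    coeffs    : one positive coefficient per term
    exponents : one row per variable, one column per term (any real numbers)
    x         : positive design variables
    """
    coeffs = np.asarray(coeffs, dtype=float)
    exponents = np.asarray(exponents, dtype=float)
    x = np.asarray(x, dtype=float)
    if np.any(coeffs <= 0.0) or np.any(x <= 0.0):
        raise ValueError("coefficients and variables must be positive")
    return float(np.sum(coeffs * np.prod(x[:, None] ** exponents, axis=0)))

# Illustrative data: f = 20 x1 x3 + 40 x2 x3 + 80 x1 x2, g1 = 8 / (x1 x2 x3)
a0 = [[1, 0, 1],
      [0, 1, 1],
      [1, 1, 0]]
a1 = [[-1], [-1], [-1]]
x = [2.0, 1.0, 4.0]
f_val = posynomial([20, 40, 80], a0, x)
g_val = posynomial([8], a1, x)
print(f_val, g_val)   # a constraint g1 <= 1 is active when g_val equals 1
```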


8.8 SOLUTION OF A CONSTRAINED GEOMETRIC 
PROGRAMMING PROBLEM 

For simplicity of notation, let us denote the objective function as

$$x_0 = g_0(\mathbf{X}) = f(\mathbf{X}) = \sum_{j=1}^{N_0} c_{0j} \prod_{i=1}^{n} x_i^{a_{0ij}} \tag{8.54}$$

The constraints given in Eq. (8.53) can be rewritten as

$$f_k = \sigma_k [1 - g_k(\mathbf{X})] \ge 0, \quad k = 1, 2, \ldots, m \tag{8.55}$$

where $\sigma_k$, the signum function, is introduced for the $k$th constraint so that it takes on
the value $+1$ or $-1$, depending on whether $g_k(\mathbf{X})$ is $\le 1$ or $\ge 1$, respectively. The
problem is to minimize the objective function, Eq. (8.54), subject to the inequality
constraints given by Eq. (8.55). This problem is called the primal problem and can be
replaced by an equivalent problem (known as the dual problem) with linear constraints,
which is often easier to solve. The dual problem involves the maximization of the dual
function, $v(\boldsymbol{\lambda})$, given by

$$v(\boldsymbol{\lambda}) = \prod_{k=0}^{m} \prod_{j=1}^{N_k} \left( \frac{c_{kj}}{\lambda_{kj}} \sum_{l=1}^{N_k} \lambda_{kl} \right)^{\sigma_k \lambda_{kj}} \tag{8.56}$$

subject to the normality and orthogonality conditions

$$\sum_{j=1}^{N_0} \lambda_{0j} = 1 \tag{8.57}$$

$$\sum_{k=0}^{m} \sum_{j=1}^{N_k} \sigma_k a_{kij} \lambda_{kj} = 0, \quad i = 1, 2, \ldots, n \tag{8.58}$$

If the problem has zero degree of difficulty, the normality and orthogonality conditions
[Eqs. (8.57) and (8.58)] yield a unique solution for $\boldsymbol{\lambda}^*$ from which the stationary value
of the original objective function can be obtained as

$$f^* = x_0^* = v(\boldsymbol{\lambda}^*) = \prod_{k=0}^{m} \prod_{j=1}^{N_k} \left( \frac{c_{kj}}{\lambda_{kj}^*} \sum_{l=1}^{N_k} \lambda_{kl}^* \right)^{\sigma_k \lambda_{kj}^*} \tag{8.59}$$

If the function $f(\mathbf{X})$ is known to possess a minimum, the stationary value $f^*$ given
by Eq. (8.59) will be the global minimum of $f$ since, in this case, there is a unique
solution for $\boldsymbol{\lambda}^*$.

The degree of difficulty of the problem ($D$) is defined as

$$D = N - n - 1 \tag{8.60}$$

where $N$ denotes the total number of posynomial terms in the problem:

$$N = \sum_{k=0}^{m} N_k \tag{8.61}$$

If the problem has a positive degree of difficulty, the linear Eqs. (8.57) and (8.58)
can be used to express any ($n + 1$) of the $\lambda_{kj}$'s in terms of the remaining $D$ of the
$\lambda_{kj}$'s. By using these relations, $v$ can be expressed as a function of the $D$ independent
$\lambda_{kj}$'s. Now the stationary points of $v$ can be found by using any of the unconstrained
optimization techniques.
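The degree of difficulty of Eq. (8.60) and the dual function of Eq. (8.56) are straightforward to code. The sketch below uses only the standard library; the nested-list layout (one inner list per posynomial, index 0 for the objective), the function names, and the sample numbers are assumptions made for illustration.

```python
def degree_of_difficulty(terms_per_posynomial, n_variables):
    """D = N - n - 1, with N the total number of terms (Eqs. 8.60 and 8.61)."""
    return sum(terms_per_posynomial) - n_variables - 1

def dual_function(c, lam, sigma):
    """Eq. (8.56): c[k][j] and lam[k][j] for posynomial k = 0..m,
    sigma[k] = +1 or -1 (with sigma[0] = +1 for the objective)."""
    v = 1.0
    for c_k, lam_k, s_k in zip(c, lam, sigma):
        total = sum(lam_k)                 # sum over l of lambda_kl
        for c_kj, lam_kj in zip(c_k, lam_k):
            if lam_kj > 0.0:               # a zero weight contributes a factor of 1
                v *= (c_kj * total / lam_kj) ** (s_k * lam_kj)
    return v

# Illustrative problem: three objective terms, one constraint term, three variables
c = [[20.0, 40.0, 80.0], [8.0]]
lam = [[1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0], [2.0 / 3.0]]
sigma = [1, 1]
dod = degree_of_difficulty([len(c_k) for c_k in c], n_variables=3)
v_val = dual_function(c, lam, sigma)
print(dod, v_val)
```

The dual variables passed in are assumed to satisfy the normality and orthogonality conditions already; the function only evaluates $v$ and does not check feasibility.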

If calculus techniques are used, the first derivatives of the function $v$ with respect
to the independent dual variables are set equal to zero. This results in as many
simultaneous nonlinear equations as there are degrees of difficulty (i.e., $N - n - 1$). The
solution of these simultaneous nonlinear equations yields the best values of the dual
variables, $\boldsymbol{\lambda}^*$. Hence this approach is occasionally impractical due to the
computations required. However, if the set of nonlinear equations can be solved, geometric
programming provides an elegant approach.


Optimum Design Variables. For problems with a zero degree of difficulty, the solution
of $\boldsymbol{\lambda}^*$ is unique. Once the optimum values of $\lambda_{kj}$ are obtained, the maximum of the
dual function $v^*$ can be obtained from Eq. (8.59), which is also the minimum of the primal
function, $f^*$. Once the optimum value of the objective function $f^* = x_0^*$ is known,
the next step is to determine the values of the design variables $x_i^*$ ($i = 1, 2, \ldots, n$).
This can be achieved by solving simultaneously the following equations:

$$\lambda_{0j}^* = \frac{c_{0j} \prod_{i=1}^{n} (x_i^*)^{a_{0ij}}}{x_0^*}, \quad j = 1, 2, \ldots, N_0 \tag{8.62}$$

$$\frac{\lambda_{kj}^*}{\sum_{l=1}^{N_k} \lambda_{kl}^*} = c_{kj} \prod_{i=1}^{n} (x_i^*)^{a_{kij}}, \quad j = 1, 2, \ldots, N_k; \quad k = 1, 2, \ldots, m \tag{8.63}$$



8.9 PRIMAL AND DUAL PROGRAMS IN THE CASE 
OF LESS-THAN INEQUALITIES 

If the original problem has a zero degree of difficulty, the minimum of the primal
problem can be obtained by maximizing the corresponding dual function. Unfortunately,
this cannot be done in the general case where there are some greater-than type of
inequality constraints. However, if the problem has all the constraints in the form of
$g_k(\mathbf{X}) \le 1$, the signum functions $\sigma_k$ are all equal to $+1$, and the objective function
$g_0(\mathbf{X})$ will be a strictly convex function of the transformed variables $w_1, w_2, \ldots, w_n$,
where

$$x_i = e^{w_i}, \quad i = 0, 1, 2, \ldots, n \tag{8.64}$$

In this case, the following primal-dual relationship can be shown to be valid:

$$f(\mathbf{X}) \ge f^* = v^* \ge v(\boldsymbol{\lambda}) \tag{8.65}$$

Table 8.2 gives the primal and the corresponding dual programs. The following
characteristics can be noted from this table:

1. The factors $c_{kj}$ appearing in the dual function $v(\boldsymbol{\lambda})$ are the coefficients of the
posynomials $g_k(\mathbf{X})$, $k = 0, 1, 2, \ldots, m$.

2. The number of components in the vector $\boldsymbol{\lambda}$ is equal to the number of terms
involved in the posynomials $g_0, g_1, g_2, \ldots, g_m$. Associated with every term in
$g_k(\mathbf{X})$, there is a corresponding $\lambda_{kj}$.

3. Each factor $\left(\sum_{l=1}^{N_k} \lambda_{kl}\right)^{\lambda_{kj}}$ of $v(\boldsymbol{\lambda})$ comes from an inequality constraint $g_k(\mathbf{X}) \le 1$.
No such factor appears from the primal function $g_0(\mathbf{X})$ as the normality
condition forces $\sum_{j=1}^{N_0} \lambda_{0j}$ to be unity.

4. The coefficient matrix $[a_{kij}]$ appearing in the orthogonality condition is the same
as the exponent matrix appearing in the posynomials of the primal program.

The following examples are considered to illustrate the method of solving geometric
programming problems with less-than inequality constraints.

Table 8.2 Corresponding Primal and Dual Programs

Primal program

Find $\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$ so that

$$g_0(\mathbf{X}) = f(\mathbf{X}) \to \text{minimum}$$

subject to the constraints

$$x_1 > 0, \; x_2 > 0, \; \ldots, \; x_n > 0,$$
$$g_1(\mathbf{X}) \le 1, \; g_2(\mathbf{X}) \le 1, \; \ldots, \; g_m(\mathbf{X}) \le 1,$$

with

$$
\begin{aligned}
g_0(\mathbf{X}) &= \sum_{j=1}^{N_0} c_{0j} x_1^{a_{01j}} x_2^{a_{02j}} \cdots x_n^{a_{0nj}} \\
g_1(\mathbf{X}) &= \sum_{j=1}^{N_1} c_{1j} x_1^{a_{11j}} x_2^{a_{12j}} \cdots x_n^{a_{1nj}} \\
g_2(\mathbf{X}) &= \sum_{j=1}^{N_2} c_{2j} x_1^{a_{21j}} x_2^{a_{22j}} \cdots x_n^{a_{2nj}} \\
&\;\;\vdots \\
g_m(\mathbf{X}) &= \sum_{j=1}^{N_m} c_{mj} x_1^{a_{m1j}} x_2^{a_{m2j}} \cdots x_n^{a_{mnj}}
\end{aligned}
$$

the exponents $a_{kij}$ are real numbers, and the coefficients $c_{kj}$ are positive numbers.

Dual program

Find $\boldsymbol{\lambda} = \{\lambda_{01}, \lambda_{02}, \ldots, \lambda_{0N_0}, \lambda_{11}, \lambda_{12}, \ldots, \lambda_{1N_1}, \ldots, \lambda_{m1}, \lambda_{m2}, \ldots, \lambda_{mN_m}\}^T$ so that

$$v(\boldsymbol{\lambda}) = \prod_{k=0}^{m} \prod_{j=1}^{N_k} \left( \frac{c_{kj}}{\lambda_{kj}} \sum_{l=1}^{N_k} \lambda_{kl} \right)^{\lambda_{kj}} \to \text{maximum}$$

subject to the constraints

$$\lambda_{01} \ge 0, \; \lambda_{02} \ge 0, \; \ldots, \; \lambda_{0N_0} \ge 0, \; \lambda_{11} \ge 0, \; \ldots, \; \lambda_{1N_1} \ge 0, \; \ldots, \; \lambda_{m1} \ge 0, \; \lambda_{m2} \ge 0, \; \ldots, \; \lambda_{mN_m} \ge 0$$

$$\sum_{j=1}^{N_0} \lambda_{0j} = 1$$

$$\sum_{k=0}^{m} \sum_{j=1}^{N_k} a_{kij} \lambda_{kj} = 0, \quad i = 1, 2, \ldots, n$$

the factors $c_{kj}$ are positive, and the coefficients $a_{kij}$ are real numbers.

Terminology (primal)

$g_0 = f$ = primal function
$x_1, x_2, \ldots, x_n$ = primal variables
$g_k \le 1$ are primal constraints ($k = 1, 2, \ldots, m$)
$x_i > 0$, $i = 1, 2, \ldots, n$ are positive restrictions
$n$ = number of primal variables
$m$ = number of primal constraints
$N = N_0 + N_1 + \cdots + N_m$ = total number of terms in the posynomials
$N - n - 1$ = degree of difficulty of the problem

Terminology (dual)

$v$ = dual function
$\lambda_{01}, \lambda_{02}, \ldots, \lambda_{mN_m}$ = dual variables
$\sum_{j=1}^{N_0} \lambda_{0j} = 1$ is the normality constraint
$\sum_{k=0}^{m} \sum_{j=1}^{N_k} a_{kij} \lambda_{kj} = 0$, $i = 1, 2, \ldots, n$ are the orthogonality constraints
$\lambda_{kj} \ge 0$, $j = 1, 2, \ldots, N_k$; $k = 0, 1, 2, \ldots, m$ are nonnegativity restrictions
$N = N_0 + N_1 + \cdots + N_m$ = number of dual variables
$n + 1$ = number of dual constraints

Example 8.3 Zero-degree-of-difficulty Problem Suppose that the problem considered
in Example 8.1 is restated in the following manner. Minimize the cost of constructing
the open rectangular box subject to the constraint that a maximum of 10 trips only are
allowed for transporting the 80 m$^3$ of grain.

SOLUTION The optimization problem can be stated as

Find $\mathbf{X} = \{x_1, x_2, x_3\}^T$ so as to minimize

$$f(\mathbf{X}) = 20x_1x_3 + 40x_2x_3 + 80x_1x_2$$

subject to

$$\frac{80}{x_1x_2x_3} \le 10 \quad \text{or} \quad \frac{8}{x_1x_2x_3} \le 1$$

Since $n = 3$ and $N = 4$, this problem has zero degree of difficulty. As $N_0 = 3$, $N_1 = 1$,
and $m = 1$, the dual problem can be stated as follows:

Find $\boldsymbol{\lambda} = \{\lambda_{01}, \lambda_{02}, \lambda_{03}, \lambda_{11}\}^T$ to maximize

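$$
\begin{aligned}
v(\boldsymbol{\lambda}) &= \prod_{k=0}^{1} \prod_{j=1}^{N_k} \left( \frac{c_{kj}}{\lambda_{kj}} \sum_{l=1}^{N_k} \lambda_{kl} \right)^{\lambda_{kj}} \\
&= \left[ \frac{c_{01}}{\lambda_{01}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{01}} \left[ \frac{c_{02}}{\lambda_{02}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{02}} \left[ \frac{c_{03}}{\lambda_{03}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{03}} \left( \frac{c_{11}}{\lambda_{11}} \lambda_{11} \right)^{\lambda_{11}}
\end{aligned} \tag{E1}
$$

subject to the constraints

$$
\begin{aligned}
\lambda_{01} + \lambda_{02} + \lambda_{03} &= 1 \\
a_{011}\lambda_{01} + a_{012}\lambda_{02} + a_{013}\lambda_{03} + a_{111}\lambda_{11} &= 0 \\
a_{021}\lambda_{01} + a_{022}\lambda_{02} + a_{023}\lambda_{03} + a_{121}\lambda_{11} &= 0 \\
a_{031}\lambda_{01} + a_{032}\lambda_{02} + a_{033}\lambda_{03} + a_{131}\lambda_{11} &= 0 \\
\lambda_{0j} \ge 0, \quad j &= 1, 2, 3 \\
\lambda_{11} &\ge 0
\end{aligned} \tag{E2}
$$

In this problem, $c_{01} = 20$, $c_{02} = 40$, $c_{03} = 80$, $c_{11} = 8$, $a_{011} = 1$, $a_{021} = 0$, $a_{031} = 1$,
$a_{012} = 0$, $a_{022} = 1$, $a_{032} = 1$, $a_{013} = 1$, $a_{023} = 1$, $a_{033} = 0$, $a_{111} = -1$, $a_{121} = -1$, and
$a_{131} = -1$. Hence Eqs. (E1) and (E2) become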


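$$v(\boldsymbol{\lambda}) = \left[ \frac{20}{\lambda_{01}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{01}} \left[ \frac{40}{\lambda_{02}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{02}} \left[ \frac{80}{\lambda_{03}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{03}} \left( \frac{8}{\lambda_{11}} \lambda_{11} \right)^{\lambda_{11}} \tag{E3}$$

subject to

$$
\begin{aligned}
\lambda_{01} + \lambda_{02} + \lambda_{03} &= 1 \\
\lambda_{01} + \lambda_{03} - \lambda_{11} &= 0 \\
\lambda_{02} + \lambda_{03} - \lambda_{11} &= 0 \\
\lambda_{01} + \lambda_{02} - \lambda_{11} &= 0 \\
\lambda_{01} \ge 0, \quad \lambda_{02} \ge 0, \quad \lambda_{03} &\ge 0, \quad \lambda_{11} \ge 0
\end{aligned} \tag{E4}
$$

The four linear equations in Eq. (E4) yield the unique solution

$$\lambda_{01}^* = \lambda_{02}^* = \lambda_{03}^* = \tfrac{1}{3}, \qquad \lambda_{11}^* = \tfrac{2}{3}$$

Thus the maximum value of $v$ or the minimum value of $x_0$ is given by

$$v^* = x_0^* = (60)^{1/3} (120)^{1/3} (240)^{1/3} (8)^{2/3} = [(60)^3]^{1/3} (8)^{1/3} (8)^{2/3} = (60)(8) = 480$$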


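The values of the design variables can be obtained by applying Eqs. (8.62) and (8.63)
as

$$\lambda_{01}^* = \frac{c_{01} (x_1^*)^{a_{011}} (x_2^*)^{a_{021}} (x_3^*)^{a_{031}}}{x_0^*}, \qquad \frac{1}{3} = \frac{20 (x_1^*)(x_3^*)}{480} = \frac{x_1^* x_3^*}{24} \tag{E5}$$

$$\lambda_{02}^* = \frac{c_{02} (x_1^*)^{a_{012}} (x_2^*)^{a_{022}} (x_3^*)^{a_{032}}}{x_0^*}, \qquad \frac{1}{3} = \frac{40 (x_2^*)(x_3^*)}{480} = \frac{x_2^* x_3^*}{12} \tag{E6}$$

$$\lambda_{03}^* = \frac{c_{03} (x_1^*)^{a_{013}} (x_2^*)^{a_{023}} (x_3^*)^{a_{033}}}{x_0^*}, \qquad \frac{1}{3} = \frac{80 (x_1^*)(x_2^*)}{480} = \frac{x_1^* x_2^*}{6} \tag{E7}$$

$$\frac{\lambda_{11}^*}{\lambda_{11}^*} = c_{11} (x_1^*)^{a_{111}} (x_2^*)^{a_{121}} (x_3^*)^{a_{131}}, \qquad 1 = 8 (x_1^*)^{-1} (x_2^*)^{-1} (x_3^*)^{-1} = \frac{8}{x_1^* x_2^* x_3^*} \tag{E8}$$

Equations (E5) to (E8) give

$$x_1^* = 2, \quad x_2^* = 1, \quad x_3^* = 4$$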
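Example 8.3 can be reproduced in a few lines. The sketch below is an illustration under the assumption that NumPy is available; the array layout (one row per design variable, one column per term in the order 01, 02, 03, 11) and the variable names are choices made here. It solves the normality and orthogonality conditions, evaluates $v^*$, and recovers the design variables from the logarithms of Eqs. (8.62) and (8.63).

```python
import numpy as np

c0 = np.array([20.0, 40.0, 80.0])         # objective coefficients c_01, c_02, c_03
c1 = np.array([8.0])                      # constraint coefficient c_11
# Exponents: rows are x1, x2, x3; columns are terms 01, 02, 03, 11
a = np.array([[1.0, 0.0, 1.0, -1.0],
              [0.0, 1.0, 1.0, -1.0],
              [1.0, 1.0, 0.0, -1.0]])

# Normality (objective weights sum to 1) stacked on the orthogonality rows
lhs = np.vstack([[1.0, 1.0, 1.0, 0.0], a])
rhs = np.array([1.0, 0.0, 0.0, 0.0])
lam = np.linalg.solve(lhs, rhs)

# Dual function with a single constraint term: (c11 * lam11 / lam11) ** lam11
v_star = float(np.prod((c0 / lam[:3]) ** lam[:3]) * c1[0] ** lam[3])

# Logarithms of Eqs. (8.62) and (8.63) are linear in w_i = ln x_i*
rhs_w = np.log(np.concatenate([lam[:3] * v_star / c0, 1.0 / c1]))
w, *_ = np.linalg.lstsq(a.T, rhs_w, rcond=None)
x_star = np.exp(w)
print(lam, v_star, x_star)
```

The least-squares call handles the fact that Eqs. (E5) to (E8) are four consistent equations in three unknowns.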


Example 8.4 One-degree-of-difficulty Problem

$$\text{Minimize } f = x_1 x_2^2 x_3^{-1} + 2x_1^{-1} x_2^{-3} x_4 + 10x_1x_3$$

subject to

$$3x_1^{-1} x_3 x_4^{-2} + 4x_3x_4 \le 1$$
$$5x_1x_2 \le 1$$

SOLUTION Here $N_0 = 3$, $N_1 = 2$, $N_2 = 1$, $N = 6$, $n = 4$, $m = 2$, and the degree
of difficulty of this problem is $N - n - 1 = 1$. The dual problem can be stated as
follows:

$$\text{Maximize } v(\boldsymbol{\lambda}) = \prod_{k=0}^{m} \prod_{j=1}^{N_k} \left( \frac{c_{kj}}{\lambda_{kj}} \sum_{l=1}^{N_k} \lambda_{kl} \right)^{\lambda_{kj}}$$

subject to

$$
\begin{aligned}
\sum_{j=1}^{N_0} \lambda_{0j} &= 1 \\
\sum_{k=0}^{m} \sum_{j=1}^{N_k} a_{kij} \lambda_{kj} &= 0, \quad i = 1, 2, \ldots, n \\
\sum_{j=1}^{N_k} \lambda_{kj} &\ge 0, \quad k = 1, 2, \ldots, m
\end{aligned} \tag{E1}
$$

As $c_{01} = 1$, $c_{02} = 2$, $c_{03} = 10$, $c_{11} = 3$, $c_{12} = 4$, $c_{21} = 5$, $a_{011} = 1$, $a_{021} = 2$, $a_{031} = -1$,
$a_{041} = 0$, $a_{012} = -1$, $a_{022} = -3$, $a_{032} = 0$, $a_{042} = 1$, $a_{013} = 1$, $a_{023} = 0$, $a_{033} = 1$,
$a_{043} = 0$, $a_{111} = -1$, $a_{121} = 0$, $a_{131} = 1$, $a_{141} = -2$, $a_{112} = 0$, $a_{122} = 0$, $a_{132} = 1$,
$a_{142} = 1$, $a_{211} = 1$, $a_{221} = 1$, $a_{231} = 0$, and $a_{241} = 0$, Eqs. (E1) become


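$$
\begin{aligned}
\text{Maximize } v(\boldsymbol{\lambda}) ={}& \left[ \frac{c_{01}}{\lambda_{01}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{01}} \left[ \frac{c_{02}}{\lambda_{02}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{02}} \left[ \frac{c_{03}}{\lambda_{03}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\lambda_{03}} \\
& \times \left[ \frac{c_{11}}{\lambda_{11}} (\lambda_{11} + \lambda_{12}) \right]^{\lambda_{11}} \left[ \frac{c_{12}}{\lambda_{12}} (\lambda_{11} + \lambda_{12}) \right]^{\lambda_{12}} \left( \frac{c_{21}}{\lambda_{21}} \lambda_{21} \right)^{\lambda_{21}}
\end{aligned}
$$

subject to

$$
\begin{aligned}
\lambda_{01} + \lambda_{02} + \lambda_{03} &= 1 \\
a_{011}\lambda_{01} + a_{012}\lambda_{02} + a_{013}\lambda_{03} + a_{111}\lambda_{11} + a_{112}\lambda_{12} + a_{211}\lambda_{21} &= 0 \\
a_{021}\lambda_{01} + a_{022}\lambda_{02} + a_{023}\lambda_{03} + a_{121}\lambda_{11} + a_{122}\lambda_{12} + a_{221}\lambda_{21} &= 0 \\
a_{031}\lambda_{01} + a_{032}\lambda_{02} + a_{033}\lambda_{03} + a_{131}\lambda_{11} + a_{132}\lambda_{12} + a_{231}\lambda_{21} &= 0 \\
a_{041}\lambda_{01} + a_{042}\lambda_{02} + a_{043}\lambda_{03} + a_{141}\lambda_{11} + a_{142}\lambda_{12} + a_{241}\lambda_{21} &= 0 \\
\lambda_{11} + \lambda_{12} &\ge 0 \\
\lambda_{21} &\ge 0
\end{aligned}
$$

or

$$\text{Maximize } v(\boldsymbol{\lambda}) = \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{2}{\lambda_{02}} \right)^{\lambda_{02}} \left( \frac{10}{\lambda_{03}} \right)^{\lambda_{03}} \left[ \frac{3}{\lambda_{11}} (\lambda_{11} + \lambda_{12}) \right]^{\lambda_{11}} \left[ \frac{4}{\lambda_{12}} (\lambda_{11} + \lambda_{12}) \right]^{\lambda_{12}} (5)^{\lambda_{21}} \tag{E2}$$

subject to

$$
\begin{aligned}
\lambda_{01} + \lambda_{02} + \lambda_{03} &= 1 \\
\lambda_{01} - \lambda_{02} + \lambda_{03} - \lambda_{11} + \lambda_{21} &= 0 \\
2\lambda_{01} - 3\lambda_{02} + \lambda_{21} &= 0 \\
-\lambda_{01} + \lambda_{03} + \lambda_{11} + \lambda_{12} &= 0 \\
\lambda_{02} - 2\lambda_{11} + \lambda_{12} &= 0 \\
\lambda_{11} + \lambda_{12} &\ge 0 \\
\lambda_{21} &\ge 0
\end{aligned} \tag{E3}
$$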

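Equations (E3) can be used to express any five of the $\lambda$'s in terms of the remaining
one as follows: Equations (E3) can be rewritten as

$$\lambda_{02} + \lambda_{03} = 1 - \lambda_{01} \tag{E4}$$
$$\lambda_{02} - \lambda_{03} + \lambda_{11} - \lambda_{21} = \lambda_{01} \tag{E5}$$
$$3\lambda_{02} - \lambda_{21} = 2\lambda_{01} \tag{E6}$$
$$\lambda_{12} = \lambda_{01} - \lambda_{03} - \lambda_{11} \tag{E7}$$
$$\lambda_{12} = 2\lambda_{11} - \lambda_{02} \tag{E8}$$

From Eqs. (E7) and (E8), we have

$$\lambda_{12} = \lambda_{01} - \lambda_{03} - \lambda_{11} = 2\lambda_{11} - \lambda_{02}$$
$$3\lambda_{11} - \lambda_{02} + \lambda_{03} = \lambda_{01} \tag{E9}$$

Adding Eqs. (E5) and (E9), we obtain

$$\lambda_{21} = 4\lambda_{11} - 2\lambda_{01} \tag{E10}$$
$$\phantom{\lambda_{21}} = 3\lambda_{02} - 2\lambda_{01} \quad \text{from Eq. (E6)}$$
$$\lambda_{11} = \tfrac{3}{4}\lambda_{02} \tag{E11}$$

Substitution of Eq. (E11) in Eq. (E8) gives

$$\lambda_{12} = \tfrac{3}{2}\lambda_{02} - \lambda_{02} = \tfrac{1}{2}\lambda_{02} \tag{E12}$$

Equations (E11), (E12), and (E7) give

$$\lambda_{03} = \lambda_{01} - \lambda_{11} - \lambda_{12} = \lambda_{01} - \tfrac{3}{4}\lambda_{02} - \tfrac{1}{2}\lambda_{02} = \lambda_{01} - \tfrac{5}{4}\lambda_{02} \tag{E13}$$

By substituting for $\lambda_{03}$, Eq. (E4) gives

$$\lambda_{02} = 8\lambda_{01} - 4 \tag{E14}$$

Using this relation for $\lambda_{02}$, the expressions for $\lambda_{03}$, $\lambda_{11}$, $\lambda_{12}$, and $\lambda_{21}$ can be obtained
as

$$\lambda_{03} = \lambda_{01} - \tfrac{5}{4}\lambda_{02} = -9\lambda_{01} + 5 \tag{E15}$$
$$\lambda_{11} = \tfrac{3}{4}\lambda_{02} = 6\lambda_{01} - 3 \tag{E16}$$
$$\lambda_{12} = \tfrac{1}{2}\lambda_{02} = 4\lambda_{01} - 2 \tag{E17}$$
$$\lambda_{21} = 4\lambda_{11} - 2\lambda_{01} = 22\lambda_{01} - 12 \tag{E18}$$

Thus the objective function in Eq. (E2) can be stated in terms of $\lambda_{01}$ as

$$
\begin{aligned}
v(\lambda_{01}) &= \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{2}{8\lambda_{01} - 4} \right)^{8\lambda_{01} - 4} \left( \frac{10}{5 - 9\lambda_{01}} \right)^{5 - 9\lambda_{01}} \left( \frac{30\lambda_{01} - 15}{6\lambda_{01} - 3} \right)^{6\lambda_{01} - 3} \left( \frac{40\lambda_{01} - 20}{4\lambda_{01} - 2} \right)^{4\lambda_{01} - 2} (5)^{22\lambda_{01} - 12} \\
&= \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{1}{4\lambda_{01} - 2} \right)^{8\lambda_{01} - 4} \left( \frac{10}{5 - 9\lambda_{01}} \right)^{5 - 9\lambda_{01}} (5)^{6\lambda_{01} - 3} (10)^{4\lambda_{01} - 2} (5)^{22\lambda_{01} - 12} \\
&= \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{1}{4\lambda_{01} - 2} \right)^{8\lambda_{01} - 4} \left( \frac{10}{5 - 9\lambda_{01}} \right)^{5 - 9\lambda_{01}} (5)^{32\lambda_{01} - 17} (2)^{4\lambda_{01} - 2}
\end{aligned}
$$

To find the maximum of $v$, we set the derivative of $v$ with respect to $\lambda_{01}$ equal to
zero. To simplify the calculations, we set $d(\ln v)/d\lambda_{01} = 0$ and find the value of $\lambda_{01}^*$.
Then the values of $\lambda_{02}^*$, $\lambda_{03}^*$, $\lambda_{11}^*$, $\lambda_{12}^*$, and $\lambda_{21}^*$ can be found from Eqs. (E14) to (E18).
Once the dual variables ($\lambda_{kj}^*$) are known, Eqs. (8.62) and (8.63) can be used to find
the optimum values of the design variables as in Example 8.3.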
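The stationarity condition $d(\ln v)/d\lambda_{01} = 0$ of Example 8.4 has no closed-form solution, but it is easy to solve numerically. The sketch below is one possible way to do it, using bisection and only the standard library; the bracket follows from requiring $\lambda_{21} \ge 0$ and $\lambda_{03} > 0$ in Eqs. (E15) and (E18), and the function names are illustrative. The derivative is written term by term from Eq. (E2); as in Example 8.2, the constant terms cancel.

```python
import math

def dual_weights(l01):
    """Eqs. (E14) to (E18): lambda_02, lambda_03, lambda_11, lambda_12, lambda_21."""
    return 8*l01 - 4, 5 - 9*l01, 6*l01 - 3, 4*l01 - 2, 22*l01 - 12

def ln_v(l01):
    """Logarithm of the dual function of Eq. (E2)."""
    l02, l03, l11, l12, l21 = dual_weights(l01)
    s1 = l11 + l12
    return (l01 * math.log(1.0 / l01) + l02 * math.log(2.0 / l02)
            + l03 * math.log(10.0 / l03) + l11 * math.log(3.0 * s1 / l11)
            + l12 * math.log(4.0 * s1 / l12) + l21 * math.log(5.0))

def dln_v(l01):
    """d(ln v)/d(lambda_01): each log term weighted by d(lambda_kj)/d(lambda_01)."""
    l02, l03, l11, l12, _ = dual_weights(l01)
    s1 = l11 + l12
    return (math.log(1.0 / l01) + 8.0 * math.log(2.0 / l02)
            - 9.0 * math.log(10.0 / l03) + 6.0 * math.log(3.0 * s1 / l11)
            + 4.0 * math.log(4.0 * s1 / l12) + 22.0 * math.log(5.0))

# lambda_21 >= 0 requires lambda_01 >= 6/11; lambda_03 > 0 requires lambda_01 < 5/9
lo, hi = 6.0 / 11.0, 5.0 / 9.0 - 1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dln_v(mid) > 0.0:
        lo = mid
    else:
        hi = mid
l01 = 0.5 * (lo + hi)
v_star = math.exp(ln_v(l01))
print(l01, dual_weights(l01), v_star)
```

The bisection assumes the derivative changes sign inside the bracket, which holds for this data; for other problems the bracket would need to be checked first.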


8.10 GEOMETRIC PROGRAMMING WITH MIXED 
INEQUALITY CONSTRAINTS 

In this case the geometric programming problem contains at least one signum function
with a value of $\sigma_k = -1$ among $k = 1, 2, \ldots, m$. (Note that $\sigma_0 = +1$ corresponds to
the objective function.) Here no general statement can be made about the convexity
or concavity of the constraint set. However, since the objective function is continuous
and is bounded below by zero, it must have a constrained minimum provided that there
exist points satisfying the constraints.

Example 8.5 

$$\text{Minimize } f = x_1 x_2^2 x_3^{-1} + 2x_1^{-1} x_2^{-3} x_4 + 10x_1x_3$$

subject to

$$3x_1 x_3^{-1} x_4^{2} + 4x_3^{-1} x_4^{-1} \ge 1$$
$$5x_1x_2 \le 1$$

SOLUTION In this problem, $m = 2$, $N_0 = 3$, $N_1 = 2$, $N_2 = 1$, $N = 6$, $n = 4$, and the
degree of difficulty is 1. The signum functions are $\sigma_0 = 1$, $\sigma_1 = -1$, and $\sigma_2 = 1$. The
dual objective function can be stated, using Eq. (8.56), as follows:


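$$
\begin{aligned}
\text{Maximize } v(\boldsymbol{\lambda}) ={}& \prod_{k=0}^{2} \prod_{j=1}^{N_k} \left( \frac{c_{kj}}{\lambda_{kj}} \sum_{l=1}^{N_k} \lambda_{kl} \right)^{\sigma_k \lambda_{kj}} \\
={}& \left[ \frac{c_{01}}{\lambda_{01}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\sigma_0 \lambda_{01}} \left[ \frac{c_{02}}{\lambda_{02}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\sigma_0 \lambda_{02}} \left[ \frac{c_{03}}{\lambda_{03}} (\lambda_{01} + \lambda_{02} + \lambda_{03}) \right]^{\sigma_0 \lambda_{03}} \\
& \times \left[ \frac{c_{11}}{\lambda_{11}} (\lambda_{11} + \lambda_{12}) \right]^{\sigma_1 \lambda_{11}} \left[ \frac{c_{12}}{\lambda_{12}} (\lambda_{11} + \lambda_{12}) \right]^{\sigma_1 \lambda_{12}} \left( \frac{c_{21}}{\lambda_{21}} \lambda_{21} \right)^{\sigma_2 \lambda_{21}} \\
={}& \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{2}{\lambda_{02}} \right)^{\lambda_{02}} \left( \frac{10}{\lambda_{03}} \right)^{\lambda_{03}} \left[ \frac{3(\lambda_{11} + \lambda_{12})}{\lambda_{11}} \right]^{-\lambda_{11}} \left[ \frac{4(\lambda_{11} + \lambda_{12})}{\lambda_{12}} \right]^{-\lambda_{12}} (5)^{\lambda_{21}}
\end{aligned} \tag{E1}
$$

The constraints are given by (see Table 8.2)

$$
\begin{aligned}
\sum_{j=1}^{N_0} \lambda_{0j} &= 1 \\
\sum_{k=0}^{m} \sum_{j=1}^{N_k} \sigma_k a_{kij} \lambda_{kj} &= 0, \quad i = 1, 2, \ldots, n \\
\sum_{j=1}^{N_k} \lambda_{kj} &\ge 0, \quad k = 1, 2, \ldots, m
\end{aligned}
$$

that is,

$$
\begin{aligned}
\lambda_{01} + \lambda_{02} + \lambda_{03} &= 1 \\
\sigma_0 a_{011}\lambda_{01} + \sigma_0 a_{012}\lambda_{02} + \sigma_0 a_{013}\lambda_{03} + \sigma_1 a_{111}\lambda_{11} + \sigma_1 a_{112}\lambda_{12} + \sigma_2 a_{211}\lambda_{21} &= 0 \\
\sigma_0 a_{021}\lambda_{01} + \sigma_0 a_{022}\lambda_{02} + \sigma_0 a_{023}\lambda_{03} + \sigma_1 a_{121}\lambda_{11} + \sigma_1 a_{122}\lambda_{12} + \sigma_2 a_{221}\lambda_{21} &= 0 \\
\sigma_0 a_{031}\lambda_{01} + \sigma_0 a_{032}\lambda_{02} + \sigma_0 a_{033}\lambda_{03} + \sigma_1 a_{131}\lambda_{11} + \sigma_1 a_{132}\lambda_{12} + \sigma_2 a_{231}\lambda_{21} &= 0 \\
\sigma_0 a_{041}\lambda_{01} + \sigma_0 a_{042}\lambda_{02} + \sigma_0 a_{043}\lambda_{03} + \sigma_1 a_{141}\lambda_{11} + \sigma_1 a_{142}\lambda_{12} + \sigma_2 a_{241}\lambda_{21} &= 0 \\
\lambda_{11} + \lambda_{12} &\ge 0 \\
\lambda_{21} &\ge 0
\end{aligned}
$$

that is,

$$
\begin{aligned}
\lambda_{01} + \lambda_{02} + \lambda_{03} &= 1 \\
\lambda_{01} - \lambda_{02} + \lambda_{03} - \lambda_{11} + \lambda_{21} &= 0 \\
2\lambda_{01} - 3\lambda_{02} + \lambda_{21} &= 0 \\
-\lambda_{01} + \lambda_{03} + \lambda_{11} + \lambda_{12} &= 0 \\
\lambda_{02} - 2\lambda_{11} + \lambda_{12} &= 0 \\
\lambda_{11} + \lambda_{12} &\ge 0 \\
\lambda_{21} &\ge 0
\end{aligned} \tag{E2}
$$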

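Since Eqs. (E2) are the same as Eqs. (E3) of the preceding example, the equality constraints
can be used to express $\lambda_{02}$, $\lambda_{03}$, $\lambda_{11}$, $\lambda_{12}$, and $\lambda_{21}$ in terms of $\lambda_{01}$ as

$\lambda_{02} = 8\lambda_{01} - 4$
$\lambda_{03} = -9\lambda_{01} + 5$
$\lambda_{11} = 6\lambda_{01} - 3$    (E3)
$\lambda_{12} = 4\lambda_{01} - 2$
$\lambda_{21} = 22\lambda_{01} - 12$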
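
The substitution in Eqs. (E3) can be checked mechanically. The following is a minimal sketch, not part of the original solution: the function names `dual_variables` and `residuals` and the trial value 11/20 are illustrative assumptions. It evaluates the five equality constraints of Eqs. (E2) in exact rational arithmetic; the sign restrictions in Eqs. (E2) are not examined here.

```python
from fractions import Fraction


def dual_variables(lam01):
    """Eqs. (E3): dual variables of Example 8.5 expressed through lambda_01."""
    lam01 = Fraction(lam01)
    return {
        "01": lam01,
        "02": 8 * lam01 - 4,
        "03": -9 * lam01 + 5,
        "11": 6 * lam01 - 3,
        "12": 4 * lam01 - 2,
        "21": 22 * lam01 - 12,
    }


def residuals(lam):
    """Equality constraints of Eqs. (E2), each written in the form '... = 0'."""
    return [
        lam["01"] + lam["02"] + lam["03"] - 1,                      # normality
        lam["01"] - lam["02"] + lam["03"] - lam["11"] + lam["21"],  # x1
        2 * lam["01"] - 3 * lam["02"] + lam["21"],                  # x2
        -lam["01"] + lam["03"] + lam["11"] + lam["12"],             # x3
        lam["02"] - 2 * lam["11"] + lam["12"],                      # x4
    ]


# The trial value is arbitrary; the equalities hold for every lambda_01.
print(residuals(dual_variables(Fraction(11, 20))))
```

Because every residual vanishes identically in $\lambda_{01}$, one free dual variable remains, which is the single degree of difficulty noted in the solution.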
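By using Eqs. (E3), the dual objective function of Eq. (E1) can be expressed as

$v(\lambda_{01}) = \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{2}{8\lambda_{01} - 4} \right)^{8\lambda_{01} - 4} \left( \frac{10}{-9\lambda_{01} + 5} \right)^{5 - 9\lambda_{01}} \left[ \frac{3(10\lambda_{01} - 5)}{6\lambda_{01} - 3} \right]^{-6\lambda_{01} + 3} \left[ \frac{4(10\lambda_{01} - 5)}{4\lambda_{01} - 2} \right]^{-4\lambda_{01} + 2} (5)^{22\lambda_{01} - 12}$

$= \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{1}{4\lambda_{01} - 2} \right)^{8\lambda_{01} - 4} \left( \frac{10}{5 - 9\lambda_{01}} \right)^{5 - 9\lambda_{01}} (5)^{3 - 6\lambda_{01}} (10)^{2 - 4\lambda_{01}} (5)^{22\lambda_{01} - 12}$

$= \left( \frac{1}{\lambda_{01}} \right)^{\lambda_{01}} \left( \frac{1}{4\lambda_{01} - 2} \right)^{8\lambda_{01} - 4} \left( \frac{10}{5 - 9\lambda_{01}} \right)^{5 - 9\lambda_{01}} (5)^{12\lambda_{01} - 7} (2)^{2 - 4\lambda_{01}}$

To maximize $v$, set $d(\ln v)/d\lambda_{01} = 0$ and find $\lambda_{01}^*$. Once $\lambda_{01}^*$ is known, $\lambda_{kj}^*$ can be
obtained from Eqs. (E3) and the optimum design variables from Eqs. (8.62) and (8.63).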


8.11 COMPLEMENTARY GEOMETRIC PROGRAMMING 

Avriel and Williams [8.4] extended the method of geometric programming to include
any rational function of posynomial terms and called the method complementary
geometric programming.† The case in which some terms may be negative will then become
a special case of complementary geometric programming. While geometric programming
problems have the remarkable property that every constrained local minimum is
also a global minimum, no such claim can generally be made for complementary
geometric programming problems. However, in many practical situations, it is sufficient
to find a local minimum.

The algorithm for solving complementary geometric programming problems con- 
sists of successively approximating rational functions of posynomial terms by posyn- 
omials. Thus solving a complementary geometric programming problem by this algo- 
rithm involves the solution of a sequence of ordinary geometric programming problems. 
It has been proved that the algorithm produces a sequence whose limit is a local 
minimum of the complementary geometric programming problem (except in some 
pathological cases). 

Let the complementary geometric programming problem be stated as follows: 

Minimize $R_0(\mathbf{X})$

subject to

$R_k(\mathbf{X}) \le 1, \quad k = 1, 2, \ldots, m$

where

$R_k(\mathbf{X}) = \frac{A_k(\mathbf{X}) - B_k(\mathbf{X})}{C_k(\mathbf{X}) - D_k(\mathbf{X})}, \quad k = 0, 1, 2, \ldots, m$    (8.66)

†The application of geometric programming to problems involving generalized polynomial functions was
presented by Passy and Wilde [8.2].
where $A_k(\mathbf{X})$, $B_k(\mathbf{X})$, $C_k(\mathbf{X})$, and $D_k(\mathbf{X})$ are posynomials in $\mathbf{X}$ and possibly some of
them may be absent. We assume that $R_0(\mathbf{X}) > 0$ for all feasible $\mathbf{X}$. This assumption
can always be satisfied by adding, if necessary, a sufficiently large constant to $R_0(\mathbf{X})$.

To solve the problem stated in Eq. (8.66), we introduce a new variable $x_0 > 0$,
constrained to satisfy the relation $x_0 \ge R_0(\mathbf{X})$ [i.e., $R_0(\mathbf{X})/x_0 \le 1$], so that the problem
can be restated as


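Minimize $x_0$    (8.67)

subject to

$\frac{A_k(\mathbf{X}) - B_k(\mathbf{X})}{C_k(\mathbf{X}) - D_k(\mathbf{X})} \le 1, \quad k = 0, 1, 2, \ldots, m$    (8.68)

where

$A_0(\mathbf{X}) = R_0(\mathbf{X})$, $C_0(\mathbf{X}) = x_0$, $B_0(\mathbf{X}) = 0$, and $D_0(\mathbf{X}) = 0$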


It is to be noted that the constraints have meaning only if $C_k(\mathbf{X}) - D_k(\mathbf{X})$ has a constant
sign throughout the feasible region. Thus if $C_k(\mathbf{X}) - D_k(\mathbf{X})$ is positive for some feasible
$\mathbf{X}$, it must be positive for all other feasible $\mathbf{X}$. Depending on the positive or negative
nature of the term $C_k(\mathbf{X}) - D_k(\mathbf{X})$, Eq. (8.68) can be rewritten as

$\frac{A_k(\mathbf{X}) + D_k(\mathbf{X})}{B_k(\mathbf{X}) + C_k(\mathbf{X})} \le 1$

or    (8.69)

$\frac{B_k(\mathbf{X}) + C_k(\mathbf{X})}{A_k(\mathbf{X}) + D_k(\mathbf{X})} \le 1$

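Thus any complementary geometric programming problem (CGP) can be stated in
standard form as

Minimize $x_0$    (8.70)

subject to

$\frac{P_k(\mathbf{X})}{Q_k(\mathbf{X})} \le 1, \quad k = 1, 2, \ldots, m$    (8.71)

$\mathbf{X} = \{x_0, x_1, x_2, \ldots, x_n\}^T > \mathbf{0}$    (8.72)

where $P_k(\mathbf{X})$ and $Q_k(\mathbf{X})$ are posynomials of the form

$P_k(\mathbf{X}) = \sum_j c_{kj} \prod_{i=0}^{n} (x_i)^{a_{kij}} = \sum_j p_{kj}(\mathbf{X})$    (8.73)

$Q_k(\mathbf{X}) = \sum_j d_{kj} \prod_{i=0}^{n} (x_i)^{b_{kij}} = \sum_j q_{kj}(\mathbf{X})$    (8.74)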
Solution Procedure.

1. Approximate each of the posynomials $Q(\mathbf{X})$† by a posynomial term. Then all
the constraints in Eq. (8.71) can be expressed as a posynomial to be less than
or equal to 1. This follows because a posynomial divided by a posynomial
term is again a posynomial. Thus with this approximation, the problem reduces
to an ordinary geometric programming problem. To approximate $Q(\mathbf{X})$ by a
single-term posynomial, we choose any $\tilde{\mathbf{X}} > \mathbf{0}$ and let

$U_j = q_j(\mathbf{X})$    (8.75)

$\Delta_j = \frac{q_j(\tilde{\mathbf{X}})}{Q(\tilde{\mathbf{X}})}$    (8.76)

where $q_j$ denotes the $j$th term of the posynomial $Q(\mathbf{X})$. Thus we obtain, by
using the arithmetic-geometric inequality, Eq. (8.22),

$Q(\mathbf{X}) = \sum_j q_j(\mathbf{X}) \ge \prod_j \left[ \frac{q_j(\mathbf{X})}{q_j(\tilde{\mathbf{X}})} Q(\tilde{\mathbf{X}}) \right]^{q_j(\tilde{\mathbf{X}})/Q(\tilde{\mathbf{X}})}$    (8.77)

By using Eq. (8.74), the inequality (8.77) can be restated as

$Q(\mathbf{X}) \ge \tilde{Q}(\mathbf{X}, \tilde{\mathbf{X}}) \equiv Q(\tilde{\mathbf{X}}) \prod_i \left( \frac{x_i}{\tilde{x}_i} \right)^{\sum_j b_{ij} q_j(\tilde{\mathbf{X}})/Q(\tilde{\mathbf{X}})}$    (8.78)

where the equality sign holds true if $x_i = \tilde{x}_i$. We can take $\tilde{Q}(\mathbf{X}, \tilde{\mathbf{X}})$ as an
approximation for $Q(\mathbf{X})$ at $\tilde{\mathbf{X}}$.

2. At any feasible point $\mathbf{X}^{(1)}$, replace $Q_k(\mathbf{X})$ in Eq. (8.71) by their approximations
$\tilde{Q}_k(\mathbf{X}, \mathbf{X}^{(1)})$, and solve the resulting ordinary geometric programming problem
to obtain the next point $\mathbf{X}^{(2)}$.

3. By continuing in this way, we generate a sequence $\{\mathbf{X}^{(\alpha)}\}$, where $\mathbf{X}^{(\alpha+1)}$ is an
optimal solution for the $\alpha$th ordinary geometric programming problem (OGP$_\alpha$):

Minimize $x_0$

subject to

$\frac{P_k(\mathbf{X})}{\tilde{Q}_k(\mathbf{X}, \mathbf{X}^{(\alpha)})} \le 1, \quad k = 1, 2, \ldots, m$

$\mathbf{X} = \{x_0, x_1, x_2, \ldots, x_n\}^T > \mathbf{0}$    (8.79)

It has been proved [8.4] that under certain mild restrictions, the sequence of
points $\{\mathbf{X}^{(\alpha)}\}$ converges to a local minimum of the complementary geometric
programming problem.


†The subscript $k$ is removed for $Q(\mathbf{X})$ for simplicity.
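
The single-term approximation of Eq. (8.78) in step 1 can be sketched in a few lines. This is an illustrative sketch only: the function name `condense` and the representation of a posynomial as a list of (coefficient, exponents) pairs are assumptions made here, not notation from the text.

```python
import math


def condense(terms, x_tilde):
    """Single-term approximation of a posynomial about x_tilde, after Eq. (8.78).

    terms   : list of (coefficient, exponents) pairs, one pair per term q_j
    x_tilde : expansion point with strictly positive components
    Returns (coefficient, exponents) of the approximating monomial.
    """
    # q_j(x_tilde) for every term, and their sum Q(x_tilde)
    values = [c * math.prod(x ** e for x, e in zip(x_tilde, exps))
              for c, exps in terms]
    total = sum(values)
    # Weights q_j(x_tilde) / Q(x_tilde); they add up to 1
    weights = [v / total for v in values]
    # Exponent of x_i in the monomial: sum_j b_ij * weight_j
    exponents = [sum(w * exps[i] for w, (_, exps) in zip(weights, terms))
                 for i in range(len(x_tilde))]
    # Coefficient chosen so that the monomial equals Q at x_tilde
    coefficient = total / math.prod(x ** e for x, e in zip(x_tilde, exponents))
    return coefficient, exponents


# Q(X) = 1 + 4 x1^2 condensed about the point (1, 1)
print(condense([(1.0, (0, 0)), (4.0, (2, 0))], (1.0, 1.0)))
```

For $Q = 1 + 4x_1^2$ about $(1, 1)$ the weights are 1/5 and 4/5, so the monomial is $5x_1^{8/5}$; the approximation touches $Q$ at the expansion point and lies below it elsewhere.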
Degree of Difficulty. The degree of difficulty of a complementary geometric
programming problem (CGP) is also defined as

degree of difficulty $= N - n - 1$

where $N$ indicates the total number of terms appearing in the numerators of Eq. (8.71).
The relation between the degree of difficulty of a CGP and that of the OGP$_\alpha$, the
approximating ordinary geometric program, is important. The degree of difficulty of a
CGP is always equal to that of the approximating OGP$_\alpha$ solved at each iteration. Thus
a CGP with zero degree of difficulty and an arbitrary number of negative terms can
be solved by a series of solutions to square systems of linear equations. If the CGP
has one degree of difficulty, at each iteration we solve an OGP with one degree of
difficulty, and so on. The degree of difficulty is independent of the choice of $\mathbf{X}^{(\alpha)}$ and
is fixed throughout the iterations. The following example is considered to illustrate the
procedure of complementary geometric programming.


Example 8.6 


Minimize $x_1$

subject to

$-4x_1^2 + 4x_2 \le 1$
$x_1 + x_2 \ge 1$
$x_1 > 0, \quad x_2 > 0$


SOLUTION This problem can be stated as a complementary geometric programming
problem as

Minimize $x_1$    (E1)

subject to

$\frac{4x_2}{1 + 4x_1^2} \le 1$    (E2)

$\frac{x_1^{-1}}{1 + x_1^{-1} x_2} \le 1$    (E3)

$x_1 > 0$    (E4)

$x_2 > 0$    (E5)

Since there are two variables ($x_1$ and $x_2$) and three posynomial terms [one term in the
objective function and one term each in the numerators of the constraint Eqs. (E2) and
(E3)], the degree of difficulty of the CGP is zero. If we denote the denominators of
Eqs. (E2) and (E3) as

$Q_1(\mathbf{X}) = 1 + 4x_1^2$
$Q_2(\mathbf{X}) = 1 + x_1^{-1} x_2$
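they can each be approximated by a single-term posynomial with the help of Eq. (8.78)
as

$\tilde{Q}_1(\mathbf{X}, \tilde{\mathbf{X}}) = (1 + 4\tilde{x}_1^2) \left( \frac{x_1}{\tilde{x}_1} \right)^{8\tilde{x}_1^2/(1 + 4\tilde{x}_1^2)}$

$\tilde{Q}_2(\mathbf{X}, \tilde{\mathbf{X}}) = \left( 1 + \frac{\tilde{x}_2}{\tilde{x}_1} \right) \left( \frac{x_1}{\tilde{x}_1} \right)^{-\tilde{x}_2/(\tilde{x}_1 + \tilde{x}_2)} \left( \frac{x_2}{\tilde{x}_2} \right)^{\tilde{x}_2/(\tilde{x}_1 + \tilde{x}_2)}$

Let us start the iterative process from the point $\mathbf{X}^{(1)} = \{1, 1\}^T$, which can be seen to be
feasible. By taking $\tilde{\mathbf{X}} = \mathbf{X}^{(1)}$, we obtain

$\tilde{Q}_1(\mathbf{X}, \mathbf{X}^{(1)}) = 5x_1^{8/5}$
$\tilde{Q}_2(\mathbf{X}, \mathbf{X}^{(1)}) = 2x_1^{-1/2} x_2^{1/2}$

and we formulate the first ordinary geometric programming problem (OGP$_1$) as

Minimize $x_1$

subject to

$\frac{4}{5} x_1^{-8/5} x_2 \le 1$

$\frac{1}{2} x_1^{-1/2} x_2^{-1/2} \le 1$

$x_1 > 0, \quad x_2 > 0$

Since this (OGP$_1$) is a geometric programming problem with zero degree of difficulty,
its solution can be found by solving a square system of linear equations, namely

$\lambda_1 = 1$
$\lambda_1 - \frac{8}{5}\lambda_2 - \frac{1}{2}\lambda_3 = 0$
$\lambda_2 - \frac{1}{2}\lambda_3 = 0$

The solution is $\lambda_1^* = 1$, $\lambda_2^* = \frac{5}{13}$, $\lambda_3^* = \frac{10}{13}$. By substituting this solution into the dual
objective function, we obtain

$v(\lambda^*) = \left( \frac{4}{5} \right)^{5/13} \left( \frac{1}{2} \right)^{10/13} \simeq 0.5385$

From the duality relations, we get

$x_1 \simeq 0.5385$ and $x_2 = \frac{5}{4}(x_1)^{8/5} \simeq 0.4643$

Thus the optimal solution of OGP$_1$ is given by

$\mathbf{X}_{\text{opt}}^{(1)} = \{0.5385, 0.4643\}^T$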
Next we choose $\mathbf{X}^{(2)}$ to be the optimal solution of OGP$_1$ [i.e., $\mathbf{X}_{\text{opt}}^{(1)}$] and
approximate $Q_1$ and $Q_2$ about this point, solve OGP$_2$, and so on. The sequence of optimal
solutions of OGP$_\alpha$ as generated by the iterative procedure is shown below:

Iteration number, α      x1        x2
0                        1.0       1.0
1                        0.5385    0.4643
2                        0.5019    0.5007
3                        0.5000    0.5000

The optimal values of the variables for the CGP are $x_1^* = 0.5$ and $x_2^* = 0.5$. It can be
seen that in three iterations, the solution of the approximating geometric programming
problems OGP$_\alpha$ is correct to four significant figures.
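
The iteration of Example 8.6 can be reproduced numerically. The sketch below is not the book's dual-based solution: it assumes that both condensed constraints are active at the optimum of every OGP (consistent with the positive dual variables found for OGP$_1$), which turns each OGP into a 2 x 2 linear system in $\ln x_1$ and $\ln x_2$. The names `condense`, `ogp_step`, and `iterate` are illustrative.

```python
import math


def condense(terms, x_tilde):
    """Single-term approximation of a posynomial about x_tilde, after Eq. (8.78)."""
    values = [c * math.prod(x ** e for x, e in zip(x_tilde, exps))
              for c, exps in terms]
    total = sum(values)
    weights = [v / total for v in values]
    exponents = [sum(w * exps[i] for w, (_, exps) in zip(weights, terms))
                 for i in range(len(x_tilde))]
    coefficient = total / math.prod(x ** e for x, e in zip(x_tilde, exponents))
    return coefficient, exponents


# Example 8.6: single-term numerators and posynomial denominators of (E2), (E3)
NUMERATORS = [(4.0, (0.0, 1.0)),       # 4 x2
              (1.0, (-1.0, 0.0))]      # x1^-1
DENOMINATORS = [[(1.0, (0, 0)), (4.0, (2, 0))],     # Q1 = 1 + 4 x1^2
                [(1.0, (0, 0)), (1.0, (-1, 1))]]    # Q2 = 1 + x1^-1 x2


def ogp_step(x_tilde):
    """Solve one approximating OGP, assuming both constraints are active."""
    rows = []
    for (cn, en), q in zip(NUMERATORS, DENOMINATORS):
        cq, eq = condense(q, x_tilde)
        # Active monomial constraint (cn/cq) x1^a x2^b = 1 in logarithmic form:
        # a ln(x1) + b ln(x2) = -ln(cn/cq)
        rows.append((en[0] - eq[0], en[1] - eq[1], -math.log(cn / cq)))
    (a11, a12, b1), (a21, a22, b2) = rows
    det = a11 * a22 - a12 * a21
    y1 = (b1 * a22 - a12 * b2) / det    # Cramer's rule
    y2 = (a11 * b2 - a21 * b1) / det
    return math.exp(y1), math.exp(y2)


def iterate(x_start, n_iter):
    """Return the starting point followed by n_iter successive OGP solutions."""
    history = [x_start]
    for _ in range(n_iter):
        history.append(ogp_step(history[-1]))
    return history


for alpha, (x1, x2) in enumerate(iterate((1.0, 1.0), 3)):
    print(alpha, round(x1, 4), round(x2, 4))
```

Starting from (1, 1), the first step reproduces the OGP$_1$ solution obtained above from the dual, and the later steps follow the tabulated sequence toward (0.5, 0.5).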


8.12 APPLICATIONS OF GEOMETRIC PROGRAMMING 


Example 8.7 Determination of Optimum Machining Conditions [8.9, 8.10] Geomet- 
ric programming has been applied for the determination of optimum cutting speed and 
feed which minimize the unit cost of a turning operation. 


Formulation as a Zero-degree-of-difficulty Problem

The total cost of turning per piece is given by

$f_0(\mathbf{X})$ = machining cost + tooling cost + handling cost

$= K_m t_m + \frac{t_m}{T}(K_m t_c + K_t) + K_m t_h$    (E1)

where $K_m$ is the cost of operating time (\$/min), $K_t$ the tool cost (\$/cutting edge),
$t_m$ the machining time per piece (min) $= \pi D L/(12VF)$, $T$ the tool life (min/cutting
edge) $= (a/VF^b)^{1/c}$, $t_c$ the tool changing time (minutes/workpiece), $t_h$ the handling
time (min/workpiece), $D$ the diameter of the workpiece (in.), $L$ the axial length of the
workpiece (in.), $V$ the cutting speed (ft/min), $F$ the feed (in./revolution), $a$, $b$, and $c$
are constants in the tool life equation, and $\mathbf{X} = \{x_1, x_2\}^T = \{V, F\}^T$.

Since the constant term will not affect the minimization, the objective function can be
taken as

$f(\mathbf{X}) = C_{01} V^{-1} F^{-1} + C_{02} V^{1/c - 1} F^{b/c - 1}$    (E2)

where

$C_{01} = \frac{K_m \pi D L}{12}$ and $C_{02} = \frac{\pi D L (K_m t_c + K_t)}{12 a^{1/c}}$    (E3)
If the maximum feed allowable on the lathe is $F_{\max}$, we have the constraint

$C_{11} F \le 1$    (E4)

where

$C_{11} = F_{\max}^{-1}$    (E5)

Since the total number of terms is three and the number of variables is two, the degree
of difficulty of the problem is zero. By using the data

$K_m = 0.10$, $K_t = 0.50$, $t_c = 0.5$, $t_h = 2.0$, $D = 6.0$,
$L = 8.0$, $a = 140.0$, $b = 0.29$, $c = 0.25$, $F_{\max} = 0.005$

the solution of the problem [minimize $f$ given in Eq. (E2) subject to the constraint
(E4)] can be obtained as

f* = $1.03 per piece, V* = 323 ft/min, F* = 0.005 in./rev
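
Because this formulation has zero degree of difficulty, the dual variables follow directly from the normality and orthogonality conditions. The sketch below is illustrative and not taken from the text: the function name `solve_machining` is an assumption, the objective is the two-term function of Eq. (E2) (the constant handling cost is left out), and the primal recovery uses the fact that the feed constraint is active when its dual variable is positive.

```python
import math


def solve_machining(K_m=0.10, K_t=0.50, t_c=0.5, D=6.0, L=8.0,
                    a=140.0, b=0.29, c=0.25, F_max=0.005):
    """Zero-degree-of-difficulty solution for Eqs. (E2) and (E4) of Example 8.7."""
    C01 = K_m * math.pi * D * L / 12.0
    C02 = math.pi * D * L * (K_m * t_c + K_t) / (12.0 * a ** (1.0 / c))
    C11 = 1.0 / F_max

    # Normality:            lam01 + lam02 = 1
    # Orthogonality for V: -lam01 + (1/c - 1) lam02 = 0
    # Orthogonality for F: -lam01 + (b/c - 1) lam02 + lam11 = 0
    lam02 = c
    lam01 = 1.0 - c
    lam11 = lam01 - (b / c - 1.0) * lam02

    # Dual objective; a one-term constraint contributes C11 ** lam11
    f_star = (C01 / lam01) ** lam01 * (C02 / lam02) ** lam02 * C11 ** lam11

    # Primal recovery: the feed limit is active (lam11 > 0), and the first
    # objective term equals lam01 * f_star at the optimum
    F_star = F_max
    V_star = C01 / (lam01 * f_star * F_star)
    return f_star, V_star, F_star


f_star, V_star, F_star = solve_machining()
print(round(f_star, 3), round(V_star, 1), F_star)
```

With the data above, the computed cost and cutting speed agree with the quoted values to within rounding.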

Formulation as a One-degree-of-difficulty Problem

If the maximum horsepower available on the lathe is given by $P_{\max}$, the power required
for machining should be less than $P_{\max}$. Since the power required for machining can
be expressed as $a_1 V^{b_1} F^{c_1}$, where $a_1$, $b_1$, and $c_1$ are constants, this constraint can be
stated as follows:

$C_{21} V^{b_1} F^{c_1} \le 1$    (E6)

where

$C_{21} = a_1 P_{\max}^{-1}$    (E7)

If the problem is to minimize $f$ given by Eq. (E2) subject to the constraints (E4) and
(E6), it will have one degree of difficulty. By taking $P_{\max} = 2.0$ and the values of $a_1$,
$b_1$, and $c_1$ as 3.58, 0.91, and 0.78, respectively, in addition to the previous data, the
following result can be obtained:

f* = $1.05 per piece, V* = 290.0 ft/min, F* = 0.005 in./rev

Formulation as a Two-degree-of-difficulty Problem

If a constraint on the surface finish is included as

$a_2 V^{b_2} F^{c_2} \le S_{\max}$

where $a_2$, $b_2$, and $c_2$ are constants and $S_{\max}$ is the maximum permissible surface
roughness in microinches, we can restate this restriction as

$C_{31} V^{b_2} F^{c_2} \le 1$    (E8)

where

$C_{31} = a_2 S_{\max}^{-1}$    (E9)

If the constraint (E8) is also included, the problem will have a degree of difficulty
two. By taking $a_2 = 1.36 \times 10^8$, $b_2 = -1.52$, $c_2 = 1.004$, $S_{\max} = 100$ μin., $F_{\max} = 0.01$,
and $P_{\max} = 2.0$ in addition to the previous data, we obtain the following result:

f* = $1.11 per piece, V* = 311 ft/min, F* = 0.0046 in./rev


Example 8.8 Design of a Hydraulic Cylinder [8.11] The minimum volume design
of a hydraulic cylinder (subject to internal pressure) is considered by taking the piston
diameter ($d$), force ($f$), hydraulic pressure ($p$), stress ($s$), and the cylinder wall
thickness ($t$) as design variables. The following constraints are considered:

Minimum force required is $F$, that is,

$f = p \frac{\pi d^2}{4} \ge F$    (E1)

Hoop stress induced should be less than $S$, that is,

$s = \frac{pd}{2t} \le S$    (E2)

Side constraints:

$d + 2t \le D$    (E3)
$p \le P$    (E4)
$t \ge T$    (E5)

where $D$ is the maximum outside diameter permissible, $P$ the maximum pressure
of the hydraulic system, and $T$ the minimum cylinder wall thickness required.
Equations (E1) to (E5) can be stated in normalized form as

$\frac{4}{\pi} F p^{-1} d^{-2} \le 1$
$\frac{1}{2} S^{-1} p d t^{-1} \le 1$
$D^{-1} d + 2D^{-1} t \le 1$
$P^{-1} p \le 1$
$T t^{-1} \le 1$

The volume of the cylinder per unit length (objective) to be minimized is given by
$\pi t(d + t)$.


Example 8.9 Design of a Cantilever Beam Formulate the problem of determining
the cross-sectional dimensions of the cantilever beam shown in Fig. 8.2 for minimum
weight. The maximum permissible bending stress is $\sigma_y$.

SOLUTION The width and depth of the beam are considered as design variables.
The objective function (weight) is given by

$f(\mathbf{X}) = \rho l x_1 x_2$    (E1)

where $\rho$ is the weight density and $l$ is the length of the beam. The maximum stress
induced at the fixed end is given by

$\sigma = \frac{Mc}{I} = Pl \frac{x_2}{2} \frac{1}{\frac{1}{12} x_1 x_2^3} = \frac{6Pl}{x_1 x_2^2}$    (E2)

and the constraint becomes

$\frac{6Pl}{\sigma_y} x_1^{-1} x_2^{-2} \le 1$    (E3)


Example 8.10 Design Of a Cone Clutch [8.23] Find the minimum volume design 
of the cone clutch shown in Fig. 1.18 such that it can transmit a specified minimum 
torque. 

SOLUTION By selecting the outer and inner radii of the cone, $R_1$ and $R_2$, as design
variables, the objective function can be expressed as

$f(R_1, R_2) = \frac{1}{3} \pi h (R_1^2 + R_1 R_2 + R_2^2)$    (E1)

where the axial thickness, $h$, is given by

$h = \frac{R_1 - R_2}{\tan \alpha}$    (E2)

Equations (E1) and (E2) yield

$f(R_1, R_2) = k_1 (R_1^3 - R_2^3)$    (E3)

where

$k_1 = \frac{\pi}{3 \tan \alpha}$    (E4)

The axial force applied ($F$) and the torque developed ($T$) are given by [8.37]

$F = \int p \, dA \sin \alpha = \int_{R_2}^{R_1} p \frac{2\pi r \, dr}{\sin \alpha} \sin \alpha = \pi p (R_1^2 - R_2^2)$    (E5)

$T = \int r f p \, dA = \int_{R_2}^{R_1} r f p \frac{2\pi r}{\sin \alpha} \, dr = \frac{2\pi f p}{3 \sin \alpha} (R_1^3 - R_2^3)$    (E6)

where $p$ is the pressure, $f$ the coefficient of friction, and $A$ the area of contact.
Substitution of $p$ from Eq. (E5) into (E6) leads to

$T = \frac{k_2 (R_1^2 + R_1 R_2 + R_2^2)}{R_1 + R_2}$    (E7)

where

$k_2 = \frac{2Ff}{3 \sin \alpha}$    (E8)

Since $k_1$ is a constant, the objective function can be taken as $f = R_1^3 - R_2^3$. The
minimum torque to be transmitted is assumed to be $5k_2$. In addition, the outer radius $R_1$
is assumed to be equal to at least twice the inner radius $R_2$. Thus the optimization
problem becomes

Minimize $f(R_1, R_2) = R_1^3 - R_2^3$

subject to

$\frac{R_1^2 + R_1 R_2 + R_2^2}{R_1 + R_2} \ge 5$    (E9)

$\frac{R_1}{R_2} \ge 2$

This problem has been solved using complementary geometric programming [8.23]
and the solution was found iteratively as shown in Table 8.3. Thus the final solution is
taken as $R_1^* = 4.2874$, $R_2^* = 2.1437$, and $f^* = 68.916$.


Example 8.11 Design Of a Helical Spring Formulate the problem of minimum 
weight design of a helical spring under axial load as a geometric programming prob- 
lem. Consider constraints on the shear stress, natural frequency, and buckling of the 
spring. 


SOLUTION By selecting the mean diameter of the coil and the diameter of the wire
as the design variables, the design vector is given by

$\mathbf{X} = \{x_1, x_2\}^T = \{D, d\}^T$    (E1)

The objective function (weight) of the helical spring can be expressed as

$f(\mathbf{X}) = \frac{\pi d^2}{4} (\pi D) \rho (n + Q)$    (E2)
Table 8.3 Results for Example 8.10

Iteration number 1
Starting design: $x_1 = R_0 = 40$, $x_2 = R_1 = 3$, $x_3 = R_2 = 3$
Ordinary geometric programming problem:
    Minimize $x_1^1 x_2^0 x_3^0$
    subject to
    $0.507 x_1^{-0.597} x_2^{3} x_3^{-1.21} \le 1$
    $1.667 (x_2^{-1} + x_3^{-1}) \le 1$
Solution of OGP: $x_1 = 162.5$, $x_2 = 5.0$, $x_3 = 2.5$

Iteration number 2
Starting design: $x_1 = R_0 = 162.5$, $x_2 = R_1 = 5.0$, $x_3 = R_2 = 2.5$
Ordinary geometric programming problem:
    Minimize $x_1^1 x_2^0 x_3^0$
    subject to
    $0.744 x_1^{-0.912} x_2^{3} x_3^{-0.2635} \le 1$
    $3.05 (x_2^{-0.43} x_3^{-0.571} + x_2^{-1.43} x_3^{0.429}) \le 1$
    $2 x_2^{-1} x_3 \le 1$
Solution of OGP: $x_1 = 82.2$, $x_2 = 4.53$, $x_3 = 2.265$

Iteration number 3
Starting design: $x_1 = R_0 = 82.2$, $x_2 = R_1 = 4.53$, $x_3 = R_2 = 2.265$
Ordinary geometric programming problem:
    Minimize $x_1^1 x_2^0 x_3^0$
    subject to
    $0.687 x_1^{-0.876} x_2^{3} x_3^{-0.372} \le 1$
    $1.924 x_1^0 x_2^{-0.429} x_3^{-0.571} + 1.924 x_1^0 x_2^{-1.429} x_3^{0.429} \le 1$
    $2 x_2^{-1} x_3 \le 1$
Solution of OGP: $x_1 = 68.916$, $x_2 = 4.2874$, $x_3 = 2.1437$



where $n$ is the number of active turns, $Q$ the number of inactive turns, and $\rho$ the weight
density of the spring. If the deflection of the spring is $\delta$, we have

$\delta = \frac{8PC^3 n}{Gd}$ or $n = \frac{Gd\delta}{8PC^3}$    (E3)

where $G$ is the shear modulus, $P$ the axial load on the spring, and $C$ the spring index
($C = D/d$). Substitution of Eq. (E3) into (E2) gives

$f(\mathbf{X}) = \frac{\pi^2 \rho G \delta}{32P} \frac{d^6}{D^2} + \frac{\pi^2 \rho Q}{4} d^2 D$    (E4)

If the maximum shear stress in the spring ($\tau$) is limited to $\tau_{\max}$, the stress constraint
can be expressed as

$\tau = \frac{8KPC}{\pi d^2} \le \tau_{\max}$ or $\frac{8KPC}{\pi d^2 \tau_{\max}} \le 1$    (E5)

where $K$ denotes the stress concentration factor given by

$K \approx \frac{2}{C^{1/4}}$    (E6)

The use of Eq. (E6) in (E5) results in

$\frac{16P}{\pi \tau_{\max}} \frac{D^{3/4}}{d^{11/4}} \le 1$    (E7)
To avoid fatigue failure, the natural frequency of the spring ($f_n$) is to be restricted to
be greater than $(f_n)_{\min}$. The natural frequency of the spring is given by

$f_n = \frac{2d}{\pi D^2 n} \left( \frac{Gg}{32\rho} \right)^{1/2}$    (E8)

where $g$ is the acceleration due to gravity. Using $g = 9.81$ m/s$^2$, $G = 8.56 \times 10^{10}$ N/m$^2$,
and $(f_n)_{\min} = 13$, Eq. (E8) becomes

$\frac{13 (f_n)_{\min} \delta G d^3}{288{,}800 \, P D} \le 1$    (E9)


Similarly, in order to avoid buckling, the free length of the spring is to be limited
as

$L \le \frac{11.5 (D/2)^2}{P/K_1}$    (E10)

Using the relations

$K_1 = \frac{Gd^4}{8D^3 n}$    (E11)

$L = nd(1 + Z)$    (E12)

and $Z = 0.4$, Eq. (E10) can be expressed as

$0.0527 \left( \frac{G \delta^2 d^5}{P D^5} \right) \le 1$    (E13)

It can be seen that the problem given by the objective function of Eq. (E4) and
constraints of Eqs. (E7), (E9), and (E13) is a geometric programming problem.


Example 8.12 Design of a Lightly Loaded Bearing [8.29] A lightly loaded bearing
is to be designed to minimize a linear combination of frictional moment and angle of
twist of the shaft while carrying a load of 1000 lb. The angular velocity of the shaft is
to be greater than 100 rad/s.


SOLUTION 


Formulation as a Zero-Degree-of-Difficulty Problem

The frictional moment of the bearing ($M$) and the angle of twist of the shaft ($\phi$) are
given by

$M = \frac{8\pi}{\sqrt{1 - n^2}} \frac{\mu \Omega}{c} R^2 L$    (E1)

$\phi = \frac{S_e l}{GR}$    (E2)

where $\mu$ is the viscosity of the lubricant, $n$ the eccentricity ratio ($= e/c$), $e$ the eccentricity
of the journal relative to the bearing, $c$ the radial clearance, $\Omega$ the angular velocity
of the shaft, $R$ the radius of the journal, $L$ the half-length of the bearing, $S_e$ the shear
stress, $l$ the length between the driving point and the rotating mass, and $G$ the shear
modulus. The load on each bearing ($W$) is given by

$W = \frac{2\mu \Omega R L^3 n}{c^2 (1 - n^2)^2} \left[ \pi^2 (1 - n^2) + 16n^2 \right]^{1/2}$    (E3)

For the data $W = 1000$ lb, $c/R = 0.0015$, $n = 0.9$, $l = 10$ in., $S_e = 30{,}000$ psi,
$\mu = 10^{-6}$ lb-s/in$^2$, and $G = 12 \times 10^6$ psi, the objective function and the constraint
reduce to

$f(R, L) = aM + b\phi = 0.038 \Omega R^2 L + 0.025 R^{-1}$    (E4)

$\Omega R^{-1} L^3 = 11.6$    (E5)

$\Omega \ge 100$    (E6)

where $a$ and $b$ are constants assumed to be $a = b = 1$. Since the solution of Eq. (E5)
gives

$\Omega = 11.6 R L^{-3}$    (E7)

the optimization problem can be stated as

Minimize $f(R, L) = 0.45 R^3 L^{-2} + 0.025 R^{-1}$    (E8)

subject to

$8.62 R^{-1} L^3 \le 1$    (E9)

The solution of this zero-degree-of-difficulty problem can be determined as $R^* = 0.212$ in.,
$L^* = 0.291$ in., and $f^* = 0.17$.
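
The quoted solution can be reproduced from the dual problem. The sketch below is illustrative only: the helper `solve_linear` (a small Gauss-Jordan elimination) and the function name `solve_bearing` are assumptions introduced here. It solves the normality and orthogonality conditions for Eqs. (E8) and (E9) and then recovers $R$ and $L$ from the optimal term contributions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def solve_bearing():
    """Zero-degree-of-difficulty solution of Eqs. (E8) and (E9), Example 8.12."""
    c01, c02, c11 = 0.45, 0.025, 8.62
    # Unknowns: lam01, lam02, lam11
    A = [[1, 1, 0],     # normality: lam01 + lam02 = 1
         [3, -1, -1],   # exponents of R: 3, -1, -1
         [-2, 0, 3]]    # exponents of L: -2, 0, 3
    lam01, lam02, lam11 = solve_linear(A, [1, 0, 0])
    # Dual objective; the one-term constraint contributes c11 ** lam11
    f_star = (c01 / lam01) ** lam01 * (c02 / lam02) ** lam02 * c11 ** lam11
    # Second objective term: 0.025 / R = lam02 * f_star
    R = c02 / (lam02 * f_star)
    # Active constraint: 8.62 L^3 / R = 1
    L = (R / c11) ** (1.0 / 3.0)
    return f_star, R, L


f_star, R, L = solve_bearing()
print(round(f_star, 3), round(R, 3), round(L, 3))
```

The dual variables come out as 0.3, 0.7, and 0.2, and the recovered $R$, $L$, and $f$ agree with the stated optimum to within rounding.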


Formulation as a One-Degree-of-Difficulty Problem

By considering the objective function as a linear combination of the frictional moment
($M$), the angle of twist of the shaft ($\phi$), and the temperature rise of the oil ($T$),
we have

$f = aM + b\phi + cT$    (E10)

where $a$, $b$, and $c$ are constants. The temperature rise of the oil in the bearing is given
by

$T = 0.045 \frac{\mu \Omega R^2}{c^2 n \sqrt{1 - n^2}}$    (E11)

By assuming that 1 in.-lb of frictional moment in the bearing is equal to 0.0025 rad of angle
of twist, which, in turn, is equivalent to 1 °F rise in temperature, the constants $a$, $b$,
and $c$ can be determined. By using Eq. (E7), the optimization problem can be stated
as

Minimize $f(R, L) = 0.44 R^3 L^{-2} + 10 R^{-1} + 0.592 R L^{-3}$    (E12)

subject to

$8.62 R^{-1} L^3 \le 1$    (E13)

The solution of this one-degree-of-difficulty problem can be found as $R^* = 1.29$, $L^* = 0.53$,
and $f^* = 16.2$.


Example 8.13 Design of a Two-bar Truss [8.33] The two-bar truss shown in Fig. 8.3
is subjected to a vertical load $2P$ and is to be designed for minimum weight. The
members have a tubular section with mean diameter $d$ and wall thickness $t$ and the
maximum permissible stress in each member ($\sigma_0$) is equal to 60,000 psi. Determine the
values of $h$ and $d$ using geometric programming for the following data: $P = 33{,}000$ lb,
$t = 0.1$ in., $b = 30$ in., $\sigma_0 = 60{,}000$ psi, and $\rho$ (density) $= 0.3$ lb/in$^3$.


SOLUTION The objective function is given by

$f(d, h) = 2\rho \pi d t \sqrt{b^2 + h^2}$
$= 2(0.3) \pi d (0.1) \sqrt{900 + h^2} = 0.188 d \sqrt{900 + h^2}$

The stress constraint can be expressed as

$\frac{P}{\pi d t} \frac{\sqrt{900 + h^2}}{h} \le \sigma_0$

or

$\frac{33{,}000}{\pi d (0.1)} \frac{\sqrt{900 + h^2}}{h} \le 60{,}000$    (E1)

or

$1.75 \frac{\sqrt{900 + h^2}}{dh} \le 1$    (E2)

Figure 8.3 Two-bar truss under load.



It can be seen that the functions in Eqs. (E1) and (E2) are not posynomials, due to the
presence of the term $\sqrt{900 + h^2}$. The functions can be converted to posynomials by
introducing a new variable $y$ as

$y = \sqrt{900 + h^2}$ or $y^2 = 900 + h^2$

and a new constraint as

$\frac{900 + h^2}{y^2} \le 1$    (E3)

Thus the optimization problem can be stated, with $x_1 = y$, $x_2 = h$, and $x_3 = d$ as
design variables, as

Minimize $f = 0.188 y d$    (E4)

subject to

$1.75 y h^{-1} d^{-1} \le 1$    (E5)
$900 y^{-2} + y^{-2} h^2 \le 1$    (E6)

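For this zero-degree-of-difficulty problem, the associated dual problem can be stated
as

Maximize $v(\lambda_{01}, \lambda_{11}, \lambda_{21}, \lambda_{22}) = \left( \frac{0.188}{\lambda_{01}} \right)^{\lambda_{01}} (1.75)^{\lambda_{11}} \left( \frac{900}{\lambda_{21}} \right)^{\lambda_{21}} \left( \frac{1}{\lambda_{22}} \right)^{\lambda_{22}} (\lambda_{21} + \lambda_{22})^{\lambda_{21} + \lambda_{22}}$    (E7)

subject to

$\lambda_{01} = 1$    (E8)
$\lambda_{01} + \lambda_{11} - 2\lambda_{21} - 2\lambda_{22} = 0$    (E9)
$-\lambda_{11} + 2\lambda_{22} = 0$    (E10)
$\lambda_{01} - \lambda_{11} = 0$    (E11)

The solution of Eqs. (E8) to (E11) gives $\lambda_{01}^* = 1$, $\lambda_{11}^* = 1$, $\lambda_{21}^* = \frac{1}{2}$, and $\lambda_{22}^* = \frac{1}{2}$. Thus
the maximum value of $v$ and the minimum value of $f$ is given by

$v^* = \left( \frac{0.188}{1} \right)^{1} (1.75)^{1} \left( \frac{900}{0.5} \right)^{0.5} \left( \frac{1}{0.5} \right)^{0.5} (0.5 + 0.5)^{0.5 + 0.5} = 19.8 = f^*$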
The optimum values of $x_i$ can be found from Eqs. (8.62) and (8.63):

$1 = \frac{0.188 y^* d^*}{19.8}$
$1 = 1.75 y^* h^{*-1} d^{*-1}$
$\frac{1}{2} = 900 y^{*-2}$
$\frac{1}{2} = y^{*-2} h^{*2}$

These equations give the solution: $y^* = 42.426$, $h^* = 30$ in., and $d^* = 2.475$ in.
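
The dual solution and the recovery of the design variables in Example 8.13 amount to a few lines of arithmetic. The sketch below is an illustrative check (the function name `solve_truss` is an assumption); it uses the rounded coefficients 0.188 and 1.75 from Eqs. (E4) and (E5), so the dual value comes out slightly below the quoted 19.8.

```python
import math


def solve_truss():
    """Dual solution and primal recovery for Eqs. (E4)-(E11) of Example 8.13."""
    lam01 = 1.0                                   # (E8)
    lam11 = lam01                                 # (E11)
    lam22 = lam11 / 2.0                           # (E10)
    lam21 = (lam01 + lam11 - 2.0 * lam22) / 2.0   # (E9)

    # Dual objective of Eq. (E7)
    v = ((0.188 / lam01) ** lam01 * 1.75 ** lam11
         * (900.0 / lam21) ** lam21 * (1.0 / lam22) ** lam22
         * (lam21 + lam22) ** (lam21 + lam22))

    # Each term of constraint (E6) equals its share lam_2j / (lam21 + lam22)
    y = math.sqrt(900.0 * (lam21 + lam22) / lam21)
    h = y * math.sqrt(lam22 / (lam21 + lam22))
    # Constraint (E5) is active: 1.75 y / (h d) = 1
    d = 1.75 * y / h
    return v, y, h, d


v, y, h, d = solve_truss()
print(round(v, 2), round(y, 3), round(h, 3), round(d, 3))
```

The recovered $y$, $h$, and $d$ match the solution stated above, and the primal objective $0.188 y d$ evaluated at these values equals the dual value, as expected at the optimum.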

Example 8.14 Design of a Four-bar Mechanism [8.24] Find the link lengths of the
four-bar linkage shown in Fig. 8.4 for minimum structural error.

SOLUTION Let $a$, $b$, $c$, and $d$ denote the link lengths, $\theta$ the input angle, and $\phi$
the output angle of the mechanism. The loop closure equation of the linkage can be
expressed as

$2ad \cos \theta - 2cd \cos \phi + (a^2 - b^2 + c^2 + d^2) - 2ac \cos(\theta - \phi) = 0$    (E1)

In function-generating linkages, the value of $\phi$ generated by the mechanism is made
equal to the desired value, $\phi_d$, only at some values of $\theta$. These are known as precision
points. In general, for arbitrary values of the link lengths, the actual output angle ($\phi_i$)
generated for a particular input angle ($\theta_i$) involves some error ($\varepsilon_i$) compared to the
desired value ($\phi_{di}$), so that

$\phi_i = \phi_{di} + \varepsilon_i$    (E2)

where $\varepsilon_i$ is called the structural error at $\theta_i$. By substituting Eq. (E2) into (E1) and
assuming that $\sin \varepsilon_i \approx \varepsilon_i$ and $\cos \varepsilon_i \approx 1$ for small values of $\varepsilon_i$, we obtain

$\varepsilon_i = \frac{K + 2ad \cos \theta_i - 2cd \cos \phi_{di} - 2ac \cos \theta_i \cos(\phi_{di} - \theta_i)}{-2ac \sin(\phi_{di} - \theta_i) - 2cd \sin \phi_{di}}$    (E3)

where

$K = a^2 - b^2 + c^2 + d^2$    (E4)



Figure 8.4 Four-bar linkage. 
The objective function for minimization is taken as the sum of squares of structural
error at a number of precision or design positions, so that

$f = \sum_{i=1}^{n} \varepsilon_i^2$    (E5)

where $n$ denotes the total number of precision points considered. Note that the error $\varepsilon_i$
is minimized when $f$ is minimized ($\varepsilon_i$ will not be zero, usually).

For simplicity, we assume that $a \ll d$ and that the error $\varepsilon_i$ is zero at $\theta_0$. Thus
$\varepsilon_0 = 0$ at $\theta_i = \theta_0$, and Eq. (E3) yields

$K = 2cd \cos \phi_{d0} + 2ac \cos \theta_0 \cos(\phi_{d0} - \theta_0) - 2ad \cos \theta_0$    (E6)

In view of the assumption $a \ll d$, we impose the constraint as (for convenience)

$\frac{3a}{d} \le 1$    (E7)

where any larger number can be used in place of 3. Thus the objective function for
minimization can be expressed as

$f = \sum_{i=1}^{n} \frac{a^2 (\cos \theta_i - \cos \theta_0)^2 - 2ac (\cos \theta_i - \cos \theta_0)(\cos \phi_{di} - \cos \phi_{d0})}{c^2 \sin^2 \phi_{di}}$    (E8)

Usually, one of the link lengths is taken as unity. By selecting $a$ and $c$ as the design
variables, the normality and orthogonality conditions can be written as

$\Delta_1^* + \Delta_2^* = 1$    (E9)
$2\Delta_1^* + \Delta_2^* = 0$    (E10)
$2\Delta_1^* + 0.5\Delta_2^* + \Delta_3^* = 0$    (E11)

These equations yield the solution $\Delta_1^* = -1$, $\Delta_2^* = 2$, and $\Delta_3^* = 1$, and the maximum
value of the dual function is given by

$v(\Delta^*) = \left( \frac{c_1}{\Delta_1^*} \right)^{\Delta_1^*} \left( \frac{c_2}{\Delta_2^*} \right)^{\Delta_2^*} \left( \frac{c_3}{\Delta_3^*} \right)^{\Delta_3^*}$    (E12)

where $c_1$, $c_2$, and $c_3$ denote the coefficients of the posynomial terms in Eqs. (E7)
and (E8).

For numerical computation, the following data are considered:

Precision point, i               1    2    3    4    5    6
Input, θ_i (deg)                 0   10   20   30   40   45
Desired output, φ_di (deg)      30   38   47   58   71   86

If we select the precision point 4 as the point where the structural error is zero ($\theta_0 = 30°$,
$\phi_{d0} = 58°$), Eq. (E8) gives

$f = 0.1563 \frac{a^2}{c^2} - 0.76 \frac{a}{c}$    (E13)

subject to

$\frac{3a}{d} \le 1$

Noting that $c_1 = 0.1563$, $c_2 = -0.76$, and $c_3 = 3/d$, we see that Eq. (E12) gives

$v(\Delta) = \left( \frac{0.1563}{-1} \right)^{-1} \left( \frac{-0.76}{2} \right)^{2} \left( \frac{3/d}{1} \right)^{1} = -\frac{2.772}{d}$

Noting that

$0.1563 \frac{a^2}{c^2} = \left( -\frac{2.772}{d} \right)(-1) = \frac{2.772}{d}$

$-0.76 \frac{a}{c} = -\frac{2.772}{d}(2) = -\frac{5.544}{d}$

and using $a = 1$, we find that $c^* = 0.41$ and $d^* = 3.0$. In addition, Eqs. (E6) and
(E4) yield

$a^2 - b^2 + c^2 + d^2 = 2cd \cos \phi_{d0} + 2ac \cos \theta_0 \cos(\phi_{d0} - \theta_0) - 2ad \cos \theta_0$

or $b^* = 3.662$. Thus the optimal link dimensions are given by $a^* = 1$, $b^* = 3.662$,
$c^* = 0.41$, and $d^* = 3.0$.
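
The numerical coefficients in Eq. (E13) of Example 8.14 follow from summing Eq. (E8) over the tabulated precision points. The sketch below is illustrative (the function name `error_coefficients` is an assumption); it returns the multipliers of $a^2/c^2$ and $a/c$ for a chosen zero-error point.

```python
import math

THETA = [0, 10, 20, 30, 40, 45]     # input angles theta_i, degrees
PHI_D = [30, 38, 47, 58, 71, 86]    # desired outputs phi_di, degrees


def error_coefficients(theta, phi_d, zero_index):
    """Coefficients of a^2/c^2 and a/c in Eq. (E8), summed over all points.

    zero_index is the 0-based index of the precision point at which the
    structural error is taken to be zero (theta_0, phi_d0).
    """
    cos_t0 = math.cos(math.radians(theta[zero_index]))
    cos_p0 = math.cos(math.radians(phi_d[zero_index]))
    c1 = 0.0
    c2 = 0.0
    for t, p in zip(theta, phi_d):
        d_theta = math.cos(math.radians(t)) - cos_t0
        d_phi = math.cos(math.radians(p)) - cos_p0
        sin_sq = math.sin(math.radians(p)) ** 2
        c1 += d_theta ** 2 / sin_sq
        c2 += -2.0 * d_theta * d_phi / sin_sq
    return c1, c2


# Precision point 4 (index 3): theta_0 = 30 deg, phi_d0 = 58 deg
c1, c2 = error_coefficients(THETA, PHI_D, 3)
print(round(c1, 4), round(c2, 4))
```

The computed values are close to the 0.1563 and -0.76 used in Eq. (E13); small differences in the last digits are attributable to rounding in the original hand calculation.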
REVIEW QUESTIONS

8.1 State whether each of the following functions is a polynomial, posynomial, or both.

(a) $f = 4 - x_1^2 + 6x_1 x_2 + 3x_2^2$

(b) $f = 4 + 2x_1^2 + 5x_1 x_2 + x_2^2$

(c) $f = 4 + 2x_1^2 x_2^{-1} + 3x_2^{-4} + 5x_1^{-1} x_2^3$

8.2 Answer true or false:

(a) The optimum values of the design variables are to be known before finding the
optimum value of the objective function in geometric programming.

(b) $\Delta_j^*$ denotes the relative contribution of the $j$th term to the optimum value of the
objective function.

(c) There are as many orthogonality conditions as there are design variables in a geometric
programming problem.

(d) If $f$ is the primal and $v$ is the dual, $f \le v$.

(e) The degree of difficulty of a complementary geometric programming problem is given
by ($N - n - 1$), where $n$ denotes the number of design variables and $N$ represents the
total number of terms appearing in the numerators of the rational functions involved.

(f) In a geometric programming problem, there are no restrictions on the number of
design variables and the number of posynomial terms.

8.3 How is the degree of difficulty defined for a constrained geometric programming problem?

8.4 What is arithmetic-geometric inequality?

8.5 What is normality condition in a geometric programming problem?

8.6 Define a complementary geometric programming problem.


PROBLEMS 


Using arithmetic mean-geometric mean inequality, obtain a lower bound $v$ for each function
[$f(x) \ge v$, where $v$ is a constant] in Problems 8.1-8.3.

8.1 $f(x) = \frac{x^{-2}}{3} + \frac{2}{3} x^{-3} + \frac{4}{3} x^{3/2}$

8.2 $f(x) = 1 + x + \frac{1}{x} + \frac{1}{x^2}$

8.3 $f(x) = \frac{1}{2} x^{-3} + x^2 + 2x$

8.4 An open cylindrical vessel is to be constructed to transport 80 m$^3$ of grain from a
warehouse to a factory. The sheet metal used for the bottom and sides cost \$80 and \$10 per
square meter, respectively. If it costs \$1 for each round trip of the vessel, find the
dimensions of the vessel for minimizing the transportation cost. Assume that the vessel has no
salvage upon completion of the operation.

8.5 Find the solution of the problem stated in Problem 8.4 by assuming that the sides cost
\$20 per square meter, instead of \$10.

8.6 Solve the problem stated in Problem 8.4 if only 10 trips are allowed for transporting the
80 m$^3$ of grain.

8.7 An automobile manufacturer needs to allocate a maximum sum of \$2.5 $\times 10^6$ between
the development of two different car models. The profit expected from both the models
is given by $x_1^{1.5} x_2$, where $x_i$ denotes the money allocated to model $i$ ($i = 1, 2$). Since
the success of each model helps the other, the amount allocated to the first model should
not exceed four times the amount allocated to the second model. Determine the amounts
to be allocated to the two models to maximize the profit expected. Hint: Minimize the
inverse of the profit expected.

8.8 Write the dual of the heat exchanger design problem stated in Problem 1.12. 

8.9 Minimize the following function: 

/(X) = X\X2X 3 2 + 2Xj 1 X 2 *JC 3 + 5.V2 + 3xiX-i 2 

8.10 Minimize the following function: 

/(X) = \x 2 +X 2 + fxj-'xj 1 

8.11 Minimize /(X) = 2O.X2X3X4 + 20x 2 x 3 1 + 5x2X 2 
subject to 

5x^“ 5 xJ* < 1 
lOx^'xlx^ 1 < 1 


X; >0, i = 1 to 4 


8.12 
Minimize /(X) = x { 2 + \x 2 xs 


subject to 


3 2-2 , 3 —2 . 

4 ^2 i g A2A3 1 

x\ >0, i = 1,2,3 


8.13 


Minimize /(X) = Yj 3 Y2 + y^ 2 y 3 1 


subject to 


x\x 2 1 + jx 1 2 x | < 1 

x\ > 0 , X 2 > 0 , x$ > 0 


8.14 


Minimize f = x 


-1 -2 -2 
1 *2 *3 


subject to 


Yj 3 + x\ + Y 3 < 1 

Y; >0, l = 1,2,3 


8.15 Prove that the function $y = c_1 e^{a_1 x_1} + c_2 e^{a_2 x_2} + \cdots + c_n e^{a_n x_n}$, $c_i \ge 0$, $i = 1, 2, \ldots, n$, is
a convex function with respect to $x_1, x_2, \ldots, x_n$.

8.16 Prove that / = lnx is a concave function for positive values of x. 

8.17 The problem of minimum weight design of a helical torsional spring subject to a stress 
constraint can be expressed as [8.27] 


where d is the wire diameter, D the mean coil diameter, ρ the density, E is Young's
modulus, φ the angular deflection in degrees, M the torsional moment, and Q the number
of inactive turns. Solve this problem using the geometric programming approach for the
following data: E = 20 × 10^10 Pa, σ_max = 15 × 10^7 Pa, φ = 20°, Q = 2, M = 0.3 N-m,
and ρ = 7.7 × 10^4 N/m^3.

8.18 Solve the machining economics problem given by Eqs. (E2) and (E4) of Example 8.7 for 
the given data. 

8.19 Solve the machining economics problem given by Eqs. (E2), (E4), and (Eg) of Example 
8.7 for the given data. 

8.20 Determine the degree of difficulty of the problem stated in Example 8.8. 


Minimize f(d, D) 



subject to 


14.5M 

Figure 8.5 Floor consisting of a plate with supporting beams [8.36]. 


8.21 A rectangular area of dimensions A and B is to be covered by steel plates with supporting 
beams as shown in Fig. 8.5. The problem of minimum cost design of the floor subject to a 
constraint on the maximum deflection of the floor under a specified uniformly distributed 
live load can be stated as [8.36] 


Minimize \(f(\mathbf{X})\) = cost of plates + cost of beams

\(= k_f \gamma A B t + k_b \gamma A k_1 n Z^{2/3}\) (1)


subject to 


56.25WB 3 

EA 


t~\~ A + 


4.69 WBA 3 \ 
Eki /' 


-I Z _ 4 / 3 < 1 


( 2 ) 


where W is the live load on the floor per unit area, \(k_f\) and \(k_b\) are the unit costs of plates
and beams, respectively, γ the weight density of steel, t the thickness of plates, n the
number of beams, \(k_1 Z^{2/3}\) the cross-sectional area of each beam, \(k_2 Z^{4/3}\) the area moment
of inertia of each beam, \(k_1\) and \(k_2\) are constants, Z the section modulus of each beam,
and E the elastic modulus of steel. The two terms on the left side of Eq. (2) denote the
contributions of steel plates and beams to the deflection of the floor. By assuming the data
as A = 10 m, B = 50 m, W = 1000 kgf/m^2, \(k_b\) = $0.05/kgf, \(k_f\) = $0.06/kgf, γ = 7850
kgf/m^3, E = 2.1 × 10^5 MN/m^2, \(k_1\) = 0.78, and \(k_2\) = 1.95, determine the solution of the
problem (i.e., the values of t*, n*, and Z*).

8.22 Solve the zero-degree-of-difficulty bearing problem given by Eqs. (Es) and (Eg) of 
Example 8.12. 

8.23 Solve the one-degree-of-difficulty bearing problem given by Eqs. (E12) and (E13) of
Example 8.12. 

8.24 The problem of minimum volume design of a statically determinate truss consisting of n 
members (bars) with m unsupported nodes and subject to q load conditions can be stated 
as follows [8.14]: 


Minimize $f = \sum_{i=1}^{n} l_i x_i$ (1)

subject to

$\dfrac{F_i^{(k)}}{x_i \sigma_i^*} \le 1, \quad i = 1, 2, \ldots, n, \quad k = 1, 2, \ldots, q$ (2)

$\sum_{i=1}^{n} \dfrac{F_i^{(k)} l_i}{x_i E \Delta_j^*} s_{ij} \le 1, \quad j = 1, 2, \ldots, m, \quad k = 1, 2, \ldots, q$ (3)

where $F_i^{(k)}$ is the tension in the ith member in the kth load condition, $x_i$ the cross-sectional
area of member i, $l_i$ the length of member i, E is Young's modulus, $\sigma_i^*$ the maximum
permissible stress in member i, and $\Delta_j^*$ the maximum allowable displacement of node j.
Develop a suitable transformation technique and express the problem of Eqs. (1) to (3)
as a geometric programming problem in terms of the design variables $x_i$.
9.1 INTRODUCTION

In most practical problems, decisions have to be made sequentially at different points 
in time, at different points in space, and at different levels, say, for a component, for 
a subsystem, and/or for a system. The problems in which the decisions are to be made 
sequentially are called sequential decision problems. Since these decisions are to be 
made at a number of stages, they are also referred to as multistage decision problems. 
Dynamic programming is a mathematical technique well suited for the optimization of 
multistage decision problems. This technique was developed by Richard Bellman in 
the early 1950s [9.2, 9.6]. 

The dynamic programming technique, when applicable, represents or decomposes 
a multistage decision problem as a sequence of single-stage decision problems. Thus 
an N-variable problem is represented as a sequence of N single-variable problems that
are solved successively. In most cases, these N subproblems are easier to solve than 
the original problem. The decomposition to N subproblems is done in such a manner 
that the optimal solution of the original N-variable problem can be obtained from
the optimal solutions of the N one-dimensional problems. It is important to note that 
the particular optimization technique used for the optimization of the N single-variable 
problems is irrelevant. It may range from a simple enumeration process to a differential 
calculus or a nonlinear programming technique. 

Multistage decision problems can also be solved by direct application of the clas- 
sical optimization techniques. However, this requires the number of variables to be 
small, the functions involved to be continuous and continuously differentiable, and the 
optimum points not to lie at the boundary points. Further, the problem has to be rela- 
tively simple so that the set of resultant equations can be solved either analytically or 
numerically. The nonlinear programming techniques can be used to solve slightly more 
complicated multistage decision problems. But their application requires the variables 
to be continuous and prior knowledge about the region of the global minimum or maximum.
In all these cases, the introduction of stochastic variability makes the problem
extremely complex and renders the problem unsolvable except by using some sort of an
approximation such as chance constrained programming.† Dynamic programming, on
the other hand, can deal with discrete variables, nonconvex, noncontinuous, and
nondifferentiable functions. In general, it can also take into account the stochastic variability
by a simple modification of the deterministic procedure. The dynamic programming
technique suffers from a major drawback, known as the curse of dimensionality. However,
despite this disadvantage, it is very suitable for the solution of a wide range of
complex problems in several areas of decision making.

†Chance constrained programming is discussed in Chapter 11.


9.2 MULTISTAGE DECISION PROCESSES 
9.2.1 Definition and Examples 

As applied to dynamic programming, a multistage decision process is one in which 
a number of single-stage processes are connected in series so that the output of one 
stage is the input of the succeeding stage. Strictly speaking, this type of process should 
be called a serial multistage decision process since the individual stages are connected 
head to tail with no recycle. Serial multistage decision problems arise in many types 
of practical problems. A few examples are given below and many others can be found 
in the literature. 

Consider a chemical process consisting of a heater, a reactor, and a distillation tower 
connected in series. The objective is to find the optimal value of temperature in the 
heater, the reaction rate in the reactor, and the number of trays in the distillation tower 
such that the cost of the process is minimum while satisfying all the restrictions placed 
on the process. Figure 9.1 shows a missile resting on a launch pad that is expected to 
hit a moving aircraft (target) in a given time interval. The target will naturally take
evasive action and attempt to avoid being hit. The problem is to generate a set of
commands to the missile so that it can hit the target in the specified time interval.
This can be done by observing the target and, from its actions, periodically generating a
new direction and speed for the missile. Next, consider the minimum cost design of a
water tank. The system consists of a tank, a set of columns, and a foundation. Here the 
tank supports the water, the columns support the weights of water and tank, and the 
foundation supports the weights of water, tank, and columns. The components can be
seen to be in series and the system has to be treated as a multistage decision problem.
Finally, consider the problem of loading a vessel with stocks of N items. Each unit
of item i has a weight $w_i$ and a monetary value $c_i$. The maximum permissible cargo
weight is W. It is required to determine the cargo load that corresponds to maximum
monetary value without exceeding the limitation of the total cargo weight. Although the
multistage nature of this problem is not directly evident, it can be posed as a multistage
decision problem by considering each item of the cargo as a separate stage.

Figure 9.1 Ground-radar-controlled missile chasing a moving target. [Figure label: target (jet plane).]
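The stage-wise view of the cargo-loading problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: unit weights and the capacity are taken to be integers, unit values are nonnegative, item i is treated as stage i, and the state is the weight capacity still available on reaching that stage. The function name `load_cargo` and the sample data are hypothetical and do not come from the text.

```python
def load_cargo(weights, values, capacity):
    """Stage-wise cargo loading sketch (integer weights assumed).

    Stage i decides how many units of item i to load; the state is the
    weight capacity still available when stage i is reached.
    """
    n = len(weights)
    # best[i][s]: largest value obtainable from items 1..i with capacity s
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    # units[i][s]: number of units of item i that achieves best[i][s]
    units = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for s in range(capacity + 1):
            best_val, best_x = float("-inf"), 0
            for x in range(s // w + 1):          # feasible unit counts
                cand = x * v + best[i - 1][s - x * w]
                if cand > best_val:
                    best_val, best_x = cand, x
            best[i][s], units[i][s] = best_val, best_x
    # Retrace the stored decisions, starting from the full capacity.
    load, s = [0] * n, capacity
    for i in range(n, 0, -1):
        load[i - 1] = units[i][s]
        s -= load[i - 1] * weights[i - 1]
    return best[n][capacity], load


# Illustrative data only: three items, cargo limit of 7 weight units.
total_value, load = load_cargo(weights=[2, 3, 4], values=[3, 5, 7], capacity=7)
print(total_value, load)
```

For the sample data the sketch loads one unit each of the second and third items for a total value of 12. Each stage only ever searches over one decision (the unit count of a single item), which is the decomposition the text describes.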

9.2.2 Representation of a Multistage Decision Process

A single-stage decision process (which is a component of the multistage problem) can 
be represented as a rectangular block (Fig. 9.2). A decision process can be character- 
ized by certain input parameters, S (or data), certain decision variables (X), and certain 
output parameters (T) representing the outcome obtained as a result of making the 
decision. The input parameters are called input state variables, and the output param- 
eters are called output state variables. Finally, there is a return or objective function 
R, which measures the effectiveness of the decisions made and the output that results 
from these decisions. For a single-stage decision process shown in Fig. 9.2, the output 
is related to the input through a stage transformation function denoted by 

T = t(X , S) (9.1) 

Since the input state of the system influences the decisions we make, the return function 
can be represented as 

R = r(X , S) (9.2) 

A serial multistage decision process can be represented schematically as shown
in Fig. 9.3. Because of some convenience, which will be seen later, the stages n,
n − 1, ..., i, ..., 2, 1 are labeled in decreasing order. For the ith stage, the input state
vector is denoted by $\mathbf{S}_{i+1}$ and the output state vector as $\mathbf{S}_i$. Since the system is a serial
one, the output from stage i + 1 must be equal to the input to stage i. Hence the state
transformation and return functions can be represented as

$\mathbf{S}_i = t_i(\mathbf{S}_{i+1}, \mathbf{X}_i)$ (9.3)

$R_i = r_i(\mathbf{S}_{i+1}, \mathbf{X}_i)$ (9.4)


Figure 9.2 Single-stage decision problem. [Block diagram: input S and decision X enter the stage transformation T = t(X, S), which produces the output T and the return R = r(X, S).]

Figure 9.3 Multistage decision problem (initial value problem). [Block diagram: stages n, n − 1, ..., i, ..., 2, 1 connected in series, with decisions $x_n, x_{n-1}, \ldots, x_1$ and returns $R_n, R_{n-1}, \ldots, R_1$.]


where $\mathbf{X}_i$ denotes the vector of decision variables at stage i. The state transformation
equations (9.3) are also called design equations.

The objective of a multistage decision problem is to find $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ so as
to optimize some function of the individual stage returns, say, $f(R_1, R_2, \ldots, R_n)$,
and satisfy Eqs. (9.3) and (9.4). The nature of the n-stage return function, f, determines
whether a given multistage problem can be solved by dynamic programming.
Since the method works as a decomposition technique, it requires the separability
and monotonicity of the objective function. To have separability of the objective
function, we must be able to represent the objective function as the composition
of the individual stage returns. This requirement is satisfied for additive objective
functions:

$f = \sum_{i=1}^{n} R_i = \sum_{i=1}^{n} R_i(\mathbf{X}_i, \mathbf{S}_{i+1})$ (9.5)

where $\mathbf{X}_i$ are real, and for multiplicative objective functions,

$f = \prod_{i=1}^{n} R_i = \prod_{i=1}^{n} R_i(\mathbf{X}_i, \mathbf{S}_{i+1})$ (9.6)

where $\mathbf{X}_i$ are real and nonnegative. On the other hand, the following objective function
is not separable:

$f = [R_1(\mathbf{X}_1, \mathbf{S}_2) + R_2(\mathbf{X}_2, \mathbf{S}_3)][R_3(\mathbf{X}_3, \mathbf{S}_4) + R_4(\mathbf{X}_4, \mathbf{S}_5)]$ (9.7)

Fortunately, there are many practical problems that satisfy the separability condition.
The objective function is said to be monotonic if for all values of a and b that
make

$R_i(x_i = a, s_{i+1}) \ge R_i(x_i = b, s_{i+1})$

the following inequality is satisfied:

$f(x_n, x_{n-1}, \ldots, x_{i+1}, x_i = a, x_{i-1}, \ldots, x_1, s_{n+1}) \ge f(x_n, x_{n-1}, \ldots, x_{i+1}, x_i = b, x_{i-1}, \ldots, x_1, s_{n+1}), \quad i = 1, 2, \ldots, n$ (9.8)
9.2.3 Conversion of a Nonserial System to a Serial System

According to the definition, a serial system is one whose components (stages) are con- 
nected in such a way that the output of any component is the input of the succeeding 
component. As an example of a nonserial system, consider a steam power plant con- 
sisting of a pump, a feedwater heater, a boiler, a superheater, a steam turbine, and an 
electric generator, as shown in Fig. 9.4. If we assume that some steam is taken from the 
turbine to heat the feedwater, a loop will be formed as shown in Fig. 9.4a. This nonserial 
system can be converted to an equivalent serial system by regrouping the components 
so that a loop is redefined as a single element as shown in Fig. 9.4b and c. Thus the 
new serial multistage system consists of only three components: the pump, the boiler 
and turbine system, and the electric generator. This procedure can easily be extended 
to convert multistage systems with more than one loop to equivalent serial systems. 

9.2.4 Types of Multistage Decision Problems

The serial multistage decision problems can be classified into three categories as
follows.

1. Initial value problem. If the value of the initial state variable, $s_{n+1}$, is prescribed,
the problem is called an initial value problem.

2. Final value problem. If the value of the final state variable, $s_1$, is prescribed, the
problem is called a final value problem. Notice that a final value problem can
be transformed into an initial value problem by reversing the directions of $s_i$,
i = 1, 2, ..., n + 1. The details of this are given in Section 9.7.



Figure 9.4 Serializing a nonserial system.

Figure 9.5 Types of multistage problems: (a) initial value problem; (b) final value problem;
(c) boundary value problem.


3. Boundary value problem. If the values of both the input and output variables 
are specified, the problem is called a boundary value problem. The three types 
of problems are shown schematically in Fig. 9.5, where the symbol +> is used 
to indicate a prescribed state variable. 


9.3 CONCEPT OF SUBOPTIMIZATION AND PRINCIPLE 
OF OPTIMALITY 

A dynamic programming problem can be stated as follows.† Find $x_1, x_2, \ldots, x_n$, which
optimizes

$f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} R_i = \sum_{i=1}^{n} r_i(s_{i+1}, x_i)$

and satisfies the design equations

$s_i = t_i(s_{i+1}, x_i), \quad i = 1, 2, \ldots, n$

The dynamic programming makes use of the concept of suboptimization and the prin- 
ciple of optimality in solving this problem. The concept of suboptimization and the 
principle of optimality will be explained through the following example of an initial 
value problem. 


†In the subsequent discussion, the design variables $x_i$ and state variables $s_i$ are denoted as scalars for
simplicity, although the theory is equally applicable even if they are vectors.
Figure 9.6 Water tank system. [Panel (a) labels: water tank to carry 100,000 liters of water (rectangular or circular); columns (RCC or steel); foundation (mat or pile).]


Example 9.1 Explain the concept of suboptimization in the context of the design of 
the water tank shown in Fig. 9.6a. The tank is required to have a capacity of 100,000
liters of water and is to be designed for minimum cost [9.10]. 

SOLUTION Instead of trying to optimize the complete system as a single unit, it 
would be desirable to break the system into components which could be optimized 
more or less individually. For this breaking and component suboptimization, a logical 
procedure is to be used; otherwise, the procedure might result in a poor solution. This 
concept can be seen by breaking the system into three components: component i (tank), 
component j (columns), and component k (foundation). Consider the suboptimization 
of component j (columns) without a consideration of the other components. If the cost 
of steel is very high, the minimum cost design of component j may correspond to 
heavy concrete columns without reinforcement. Although this design may be accept- 
able for columns, the entire weight of the columns has to be carried by the foundation. 
This may result in a foundation that is prohibitively expensive. This shows that the 
suboptimization of component j has adversely influenced the design of the following 
component k. This example shows that the design of any interior component affects the 
designs of all the subsequent (downstream) components. As such, it cannot be subop- 
timized without considering its effect on the downstream components. The following 
mode of suboptimization can be adopted as a rational optimization strategy. Since the 
last component in a serial system influences no other component, it can be suboptimized 
independently. Then the last two components can be considered together as a single 
(larger) component and can be suboptimized without adversely influencing any of the 
downstream components. This process can be continued to group any number of end 
components as a single (larger) end component and suboptimize them. This process of
suboptimization is shown in Fig. 9.7. Since the suboptimizations are to be done in the
reverse order, the components of the system are also numbered in the same manner for
convenience (see Fig. 9.3).

Figure 9.7 Suboptimization (principle of optimality). [Panels: original system; suboptimize component i; suboptimize components j and i; suboptimize components k, j, and i (complete system).]

The process of suboptimization was stated by Bellman [9.2] as the principle of 
optimality: 

An optimal policy (or a set of decisions) has the property that whatever the 
initial state and initial decision are, the remaining decisions must constitute 
an optimal policy with regard to the state resulting from the first decision. 

Recurrence Relationship. Suppose that the desired objective is to minimize the
n-stage objective function f, which is given by the sum of the individual stage returns:

Minimize $f = R_n(x_n, s_{n+1}) + R_{n-1}(x_{n-1}, s_n) + \cdots + R_1(x_1, s_2)$ (9.9)

where the state and decision variables are related as

$s_i = t_i(s_{i+1}, x_i), \quad i = 1, 2, \ldots, n$ (9.10)
Consider the first subproblem by starting at the final stage, i = 1. If the input to this
stage $s_2$ is specified, then according to the principle of optimality, $x_1$ must be selected
to optimize $R_1$. Irrespective of what happens to the other stages, $x_1$ must be selected
such that $R_1(x_1, s_2)$ is an optimum for the input $s_2$. If the optimum is denoted as $f_1^*$,
we have

$f_1^*(s_2) = \underset{x_1}{\operatorname{opt}} [R_1(x_1, s_2)]$ (9.11)

This is called a one-stage policy since once the input state $s_2$ is specified, the optimal
values of $R_1$, $x_1$, and $s_1$ are completely defined. Thus Eq. (9.11) is a parametric equation
giving the optimum $f_1^*$ as a function of the input parameter $s_2$.

Next, consider the second subproblem by grouping the last two stages together.
If $f_2^*$ denotes the optimum objective value of the second subproblem for a specified
value of the input $s_3$, we have

$f_2^*(s_3) = \underset{x_1, x_2}{\operatorname{opt}} [R_2(x_2, s_3) + R_1(x_1, s_2)]$ (9.12)

The principle of optimality requires that $x_1$ be selected so as to optimize $R_1$ for a given
$s_2$. Since $s_2$ can be obtained once $x_2$ and $s_3$ are specified, Eq. (9.12) can be written
as

$f_2^*(s_3) = \underset{x_2}{\operatorname{opt}} [R_2(x_2, s_3) + f_1^*(s_2)]$ (9.13)

Thus $f_2^*$ represents the optimal policy for the two-stage subproblem. It can be seen
that the principle of optimality reduced the dimensionality of the problem from two
[in Eq. (9.12)] to one [in Eq. (9.13)]. This can be seen more clearly by rewriting
Eq. (9.13) using Eq. (9.10) as

$f_2^*(s_3) = \underset{x_2}{\operatorname{opt}} [R_2(x_2, s_3) + f_1^*\{t_2(s_3, x_2)\}]$ (9.14)

In this form it can be seen that for a specified input $s_3$, the optimum is determined solely
by a suitable choice of the decision variable $x_2$. Thus the optimization problem stated
in Eq. (9.12), in which both $x_2$ and $x_1$ are to be simultaneously varied to produce the
optimum $f_2^*$, is reduced to two subproblems defined by Eqs. (9.11) and (9.13). Since
the optimization of each of these subproblems involves only a single decision variable,
the optimization is, in general, much simpler.

This idea can be generalized and the ith subproblem defined by

$f_i^*(s_{i+1}) = \underset{x_i, x_{i-1}, \ldots, x_1}{\operatorname{opt}} [R_i(x_i, s_{i+1}) + R_{i-1}(x_{i-1}, s_i) + \cdots + R_1(x_1, s_2)]$ (9.15)

which can be written as

$f_i^*(s_{i+1}) = \underset{x_i}{\operatorname{opt}} [R_i(x_i, s_{i+1}) + f_{i-1}^*(s_i)]$ (9.16)

where $f_{i-1}^*$ denotes the optimal value of the objective function corresponding to the last
i − 1 stages, and $s_i$ is the input to the stage i − 1. The original problem in Eq. (9.15)
requires the simultaneous variation of i decision variables, $x_1, x_2, \ldots, x_i$, to determine
the optimum value of $f_i = \sum_{k=1}^{i} R_k$ for any specified value of the input $s_{i+1}$. This
problem, by using the principle of optimality, has been decomposed into i separate
problems, each involving only one decision variable. Equation (9.16) is the desired
recurrence relationship valid for i = 2, 3, . . . , n.
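The recurrence of Eq. (9.16) maps naturally onto a memoized recursive function. The Python sketch below is one possible rendering, not the book's procedure: it assumes discrete (hashable) states and decisions, and it adopts the convention $f_0^* = 0$ so that the case i = 1 reduces to Eq. (9.11). The names `make_f_star`, `stage_return`, `transform`, and `decisions`, as well as the profit table, are hypothetical.

```python
from functools import lru_cache


def make_f_star(stage_return, transform, decisions, opt=min):
    """Build f_i^*(s_{i+1}) from Eq. (9.16) for discrete states and decisions.

    stage_return(i, x, s_in) -> R_i(x_i, s_{i+1})
    transform(i, s_in, x)    -> s_i = t_i(s_{i+1}, x_i)
    decisions(i, s_in)       -> iterable of admissible x_i
    """
    @lru_cache(maxsize=None)
    def f_star(i, s_in):
        if i == 0:
            return 0  # assumed convention: no stages left, no return
        # Only one decision variable (x_i) is searched at each stage.
        return opt(
            stage_return(i, x, s_in) + f_star(i - 1, transform(i, s_in, x))
            for x in decisions(i, s_in)
        )

    return f_star


# Hypothetical data: profit[i][x] is the return of stage i when x units
# of a shared resource are allotted to it.
profit = {1: [0, 5, 8, 9], 2: [0, 4, 10, 11], 3: [0, 6, 7, 8]}

f_star = make_f_star(
    stage_return=lambda i, x, s: profit[i][x],
    transform=lambda i, s, x: s - x,      # resource left for later stages
    decisions=lambda i, s: range(s + 1),  # cannot allot more than is left
    opt=max,
)
print(f_star(3, 3))  # optimal three-stage return for 3 units of resource
```

With this data the three-stage optimum is 16. The cache plays the role of the stage tables: each pair (i, $s_{i+1}$) is solved once, however many upstream decisions lead to it.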


9.4 COMPUTATIONAL PROCEDURE IN DYNAMIC 
PROGRAMMING 

The use of the recurrence relationship derived in Section 9.3 in actual computations is
discussed in this section [9.10]. As stated, dynamic programming begins by suboptimizing
the last component, numbered 1. This involves the determination of

$f_1^*(s_2) = \underset{x_1}{\operatorname{opt}} [R_1(x_1, s_2)]$ (9.17)

The best value of the decision variable $x_1$, denoted as $x_1^*$, is that which makes the
return (or objective) function $R_1$ assume its optimum value, denoted by $f_1^*$. Both $x_1^*$
and $f_1^*$ depend on the condition of the input or feed that the component 1 receives from
the upstream, that is, on $s_2$. Since the particular value $s_2$ will assume after the upstream
components are optimized is not known at this time, this last-stage suboptimization
problem is solved for a "range" of possible values of $s_2$ and the results are entered
into a graph or a table. This graph or table contains a complete summary of the results
of suboptimization of stage 1. In some cases, it may be possible to express $f_1^*$ as a
function of $s_2$. If the calculations are to be performed on a computer, the results of
suboptimization have to be stored in the form of a table in the computer. Figure 9.8
shows a typical table in which the results obtained from the suboptimization of
stage 1 are entered.

Next we move up the serial system to include the last two components. In this
two-stage suboptimization, we have to determine

$f_2^*(s_3) = \underset{x_2, x_1}{\operatorname{opt}} [R_2(x_2, s_3) + R_1(x_1, s_2)]$ (9.18)

Since all the information about component 1 has already been encoded in the table
corresponding to $f_1^*$, this information can then be substituted for $R_1$ in Eq. (9.18) to
get the following simplified statement:

$f_2^*(s_3) = \underset{x_2}{\operatorname{opt}} [R_2(x_2, s_3) + f_1^*(s_2)]$ (9.19)

Thus the number of variables to be considered has been reduced from two ($x_1$ and $x_2$)
to one ($x_2$). A range of possible values of $s_3$ must be considered and for each one, $x_2^*$
must be found so as to optimize $[R_2 + f_1^*(s_2)]$. The results ($x_2^*$ and $f_2^*$ for different
$s_3$) of this suboptimization are entered in a table as shown in Fig. 9.9.

Figure 9.8 Suboptimization of component 1 for various settings of the input state variable $s_2$. [Panel (a): component 1 with input $s_2$, decision $x_1$, return $R_1(x_1, s_2)$, and output $s_1$. Panel (b): summary of stage 1.]



Figure 9.9 Suboptimization of components 1 and 2 for various settings of the input state
variable $s_3$. [Panel (a): for each setting of $s_3$, a graph of $R_2 + f_1^*(s_2)$ against $x_2$ gives $x_2^*$ and $f_2^*$. Panel (b): summary of stages 2 and 1, a table with columns $s_3$, $x_2^*$, $f_2^*$, and $s_2$.]
Assuming that the suboptimization sequence has been carried on to include i − 1
of the end components, the next step will be to suboptimize the i end components.
This requires the solution of

$f_i^*(s_{i+1}) = \underset{x_i, x_{i-1}, \ldots, x_1}{\operatorname{opt}} [R_i + R_{i-1} + \cdots + R_1]$ (9.20)

However, again, all the information regarding the suboptimization of i − 1 end components
is known and has been entered in the table corresponding to $f_{i-1}^*$. Hence this
information can be substituted in Eq. (9.20) to obtain

$f_i^*(s_{i+1}) = \underset{x_i}{\operatorname{opt}} [R_i(x_i, s_{i+1}) + f_{i-1}^*(s_i)]$ (9.21)

Thus the dimensionality of the i-stage suboptimization has been reduced to 1, and the
equation $s_i = t_i(s_{i+1}, x_i)$ provides the functional relation between $x_i$ and $s_i$. As before,
a range of values of $s_{i+1}$ is to be considered, and for each one, $x_i^*$ is to be found so
as to optimize $[R_i + f_{i-1}^*]$. A table showing the values of $x_i^*$ and $f_i^*$ for each of the
values of $s_{i+1}$ is made as shown in Fig. 9.10.

The suboptimization procedure above is continued until stage n is reached.
At this stage only one value of $s_{n+1}$ needs to be considered (for initial value
problems), and the optimization of the n components completes the solution of the
problem.

The final thing needed is to retrace the steps through the tables generated, to gather
the complete set of $x_i^*$ (i = 1, 2, ..., n) for the system. This can be done as follows.
The nth suboptimization gives the values of $x_n^*$ and $f_n^*$ for the specified value of $s_{n+1}$
(for initial value problem). The known design equation $s_n = t_n(s_{n+1}, x_n^*)$ can be used
to find the input, $s_n^*$, to the (n − 1)th stage. From the tabulated results for $f_{n-1}^*(s_n)$, the
optimum values $f_{n-1}^*$ and $x_{n-1}^*$ corresponding to $s_n^*$ can readily be obtained. Again the
known design equation $s_{n-1} = t_{n-1}(s_n, x_{n-1}^*)$ can be used to find the input, $s_{n-1}^*$, to the
(n − 2)th stage. As before, from the tabulated results of $f_{n-2}^*(s_{n-1})$, the optimal values
$x_{n-2}^*$ and $f_{n-2}^*$ corresponding to $s_{n-1}^*$ can be found. This procedure is continued until
the values $x_1^*$ and $f_1^*$ corresponding to $s_2^*$ are obtained. Then the optimum solution
vector of the original problem is given by $(x_1^*, x_2^*, \ldots, x_n^*)$ and the optimum value of
the objective function by $f_n^*$.
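The tabulate-then-retrace procedure just described can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the states are restricted to a finite grid, every transformed state is assumed to fall on that grid, and $f_0^*$ is taken as zero. The function name `tabulate_and_retrace` and the profit data are hypothetical.

```python
def tabulate_and_retrace(n, states, decisions, stage_return, transform,
                         s_initial, opt=min):
    """Build the stage tables for i = 1..n, then retrace from s_{n+1}.

    f_tab[i][s] holds f_i^*(s_{i+1} = s); x_tab[i][s] holds the matching x_i^*.
    transform(i, s, x) must return a value that is itself in `states`.
    """
    f_tab = {0: {s: 0 for s in states}}   # assumed convention f_0^* = 0
    x_tab = {}
    for i in range(1, n + 1):             # stage 1 (last component) first
        f_tab[i], x_tab[i] = {}, {}
        for s_in in states:
            candidates = [
                (stage_return(i, x, s_in)
                 + f_tab[i - 1][transform(i, s_in, x)], x)
                for x in decisions(i, s_in)
            ]
            f_tab[i][s_in], x_tab[i][s_in] = opt(candidates,
                                                 key=lambda c: c[0])
    # Retrace: read x_n^* for the prescribed s_{n+1}, apply the design
    # equation to get s_n, read x_{n-1}^* from the next table, and so on.
    x_star, s = {}, s_initial
    for i in range(n, 0, -1):
        x_star[i] = x_tab[i][s]
        s = transform(i, s, x_star[i])
    return f_tab[n][s_initial], x_star


# Hypothetical data: profit[i][x] is the return of stage i for x units.
profit = {1: [0, 5, 8, 9], 2: [0, 4, 10, 11], 3: [0, 6, 7, 8]}

f_opt, x_opt = tabulate_and_retrace(
    n=3,
    states=range(4),
    decisions=lambda i, s: range(s + 1),
    stage_return=lambda i, x, s: profit[i][x],
    transform=lambda i, s, x: s - x,
    s_initial=3,
    opt=max,
)
print(f_opt, x_opt)
```

For this data the sketch returns an optimum of 16 with $x_3^* = 1$, $x_2^* = 2$, and $x_1^* = 0$. The tables are filled for every state on the grid because, as noted above, the state that each stage will actually receive is not known until the upstream stages have been optimized.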


Figure 9.10 Suboptimization of components 1, 2, ..., i for various settings of the input state
variable $s_{i+1}$. [Panel (a): for each setting of $s_{i+1}$, a graph of $R_i + f_{i-1}^*(s_i)$ against $x_i$ gives $x_i^*$ and $f_i^*(s_{i+1})$. Panel (b): summary of stages i, i − 1, ..., 2, and 1.]

9.5 EXAMPLE ILLUSTRATING THE CALCULUS METHOD
OF SOLUTION

Example 9.2 The four-bar truss shown in Fig. 9.11 is subjected to a vertical load of
$2 \times 10^5$ lb at joint A as shown. Determine the cross-sectional areas of the members
(bars) such that the total weight of the truss is minimum and the vertical deflection
of joint A is equal to 0.5 in. Assume the unit weight as 0.01 lb/in^3 and the Young's
modulus as $20 \times 10^6$ psi.

SOLUTION Let $x_i$ denote the area of cross section of member i (i = 1, 2, 3, 4). The
lengths of members are given by $l_1 = l_3 = 100$ in., $l_2 = 120$ in., and $l_4 = 60$ in. The
weight of the truss is given by

$f(x_1, x_2, x_3, x_4) = 0.01(100x_1 + 120x_2 + 100x_3 + 60x_4) = x_1 + 1.2x_2 + x_3 + 0.6x_4$ (E1)

From structural analysis [9.5], the force developed in member i due to a unit load acting
at joint A ($p_i$), the deformation of member i ($d_i$), and the contribution of member i to
the vertical deflection of A ($\delta_i = p_i d_i$) can be determined as follows:
Figure 9.11 Four-bar truss.

Member i    p_i       d_i = (stress_i) l_i / E = P p_i l_i / (x_i E) (in.)    δ_i = p_i d_i (in.)
1           −1.25     −1.25/x_1                                               1.5625/x_1
2            0.75      0.9/x_2                                                0.6750/x_2
3            1.25      1.25/x_3                                               1.5625/x_3
4           −1.50     −0.9/x_4                                                1.3500/x_4

The vertical deflection of joint A is given by

$d_A = \sum_{i=1}^{4} \delta_i = \dfrac{1.5625}{x_1} + \dfrac{0.6750}{x_2} + \dfrac{1.5625}{x_3} + \dfrac{1.3500}{x_4}$ (E2)

Thus the optimization problem can be stated as

Minimize $f(\mathbf{X}) = x_1 + 1.2x_2 + x_3 + 0.6x_4$

subject to

$\dfrac{1.5625}{x_1} + \dfrac{0.6750}{x_2} + \dfrac{1.5625}{x_3} + \dfrac{1.3500}{x_4} = 0.5$ (E3)

$x_1 \ge 0, \quad x_2 \ge 0, \quad x_3 \ge 0, \quad x_4 \ge 0$

Since the deflection of joint A is the sum of contributions of the various members,
we can consider the 0.5 in. deflection as a resource to be allocated to the various
activities $x_i$ and the problem can be posed as a multistage decision problem as shown
in Fig. 9.12. Let $s_2$ be the displacement (resource) available for allocation to the first
member (stage 1), $\delta_1$ the displacement contribution due to the first member, and $f_1^*(s_2)$
the minimum weight of the first member. Then

$f_1^*(s_2) = \min [R_1 = x_1] = \dfrac{1.5625}{s_2}$ (E4)

such that

$\delta_1 = \dfrac{1.5625}{x_1}$ and $x_1 \ge 0$
since $\delta_1 = s_2$, and

$x_1^* = \dfrac{1.5625}{s_2}$ (E5)

Figure 9.12 Example 9.2 as a four-stage decision problem. [Block diagram: stages 4, 3, 2, and 1 in series with decisions $x_4$, $x_3$, $x_2$, and $x_1$.]

Let $s_3$ be the displacement available for allocation to the first two members, $\delta_2$ the
displacement contribution due to the second member, and $f_2^*(s_3)$ the minimum weight
of the first two members. Then we have, from the recurrence relationship of Eq. (9.16),

$f_2^*(s_3) = \min_{x_2 \ge 0} [R_2 + f_1^*(s_2)]$ (E6)

where $s_2$ represents the resource available after allocation to stage 2 and is given by

$s_2 = s_3 - \delta_2 = s_3 - \dfrac{0.6750}{x_2}$ (E7)

Hence from Eq. (E4), we have

$f_1^*(s_2) = f_1^*\left(s_3 - \dfrac{0.6750}{x_2}\right) = \dfrac{1.5625}{s_3 - 0.6750/x_2}$ (E8)

Thus Eq. (E6) becomes

$f_2^*(s_3) = \min_{x_2 \ge 0} \left[1.2x_2 + \dfrac{1.5625}{s_3 - 0.6750/x_2}\right]$

Let

$F(s_3, x_2) = 1.2x_2 + \dfrac{1.5625}{s_3 - 0.6750/x_2} = 1.2x_2 + \dfrac{1.5625x_2}{s_3 x_2 - 0.6750}$

For any specified value of $s_3$, the minimum of F is given by

$\dfrac{\partial F}{\partial x_2} = 1.2 - \dfrac{(1.5625)(0.6750)}{(s_3 x_2 - 0.6750)^2} = 0$ or $x_2^* = \dfrac{1.6124}{s_3}$ (E9)

$f_2^*(s_3) = 1.2x_2^* + \dfrac{1.5625}{s_3 - 0.6750/x_2^*} = \dfrac{1.9349}{s_3} + \dfrac{2.6820}{s_3} = \dfrac{4.6169}{s_3}$ (E10)
Let $s_4$ be the displacement available for allocation to the first three members. Let
$\delta_3$ be the displacement contribution due to the third member and $f_3^*(s_4)$ the minimum
weight of the first three members. Then

$f_3^*(s_4) = \min_{x_3 \ge 0} [x_3 + f_2^*(s_3)]$ (E11)

where $s_3$ is the resource available after allocation to stage 3 and is given by

$s_3 = s_4 - \delta_3 = s_4 - \dfrac{1.5625}{x_3}$ (E12)

From Eq. (E10) we have

$f_2^*(s_3) = \dfrac{4.6169}{s_4 - 1.5625/x_3}$ (E13)

and Eq. (E11) can be written as

$f_3^*(s_4) = \min_{x_3 \ge 0} \left[x_3 + \dfrac{4.6169x_3}{s_4 x_3 - 1.5625}\right]$ (E14)

As before, by letting

$F(s_4, x_3) = x_3 + \dfrac{4.6169x_3}{s_4 x_3 - 1.5625}$

the minimum of F, for any specified value of $s_4$, can be obtained as

$\dfrac{\partial F}{\partial x_3} = 1.0 - \dfrac{(4.6169)(1.5625)}{(s_4 x_3 - 1.5625)^2} = 0$ or $x_3^* = \dfrac{4.2445}{s_4}$ (E15)

$f_3^*(s_4) = x_3^* + \dfrac{4.6169x_3^*}{s_4 x_3^* - 1.5625} = \dfrac{4.2445}{s_4} + \dfrac{7.3151}{s_4} = \dfrac{11.5596}{s_4}$ (E16)

Finally, let $s_5$ denote the displacement available for allocation to the first four
members. If $\delta_4$ denotes the displacement contribution due to the fourth member, and
$f_4^*(s_5)$ the minimum weight of the first four members, then

$$f_4^*(s_5) = \min_{x_4 \ge 0} \left[ 0.6x_4 + f_3^*(s_4) \right] \tag{E17}$$

where the resource available after allocation to the fourth member ($s_4$) is given by

$$s_4 = s_5 - \delta_4 = s_5 - \frac{1.3500}{x_4} \tag{E18}$$

From Eqs. (E16), (E17), and (E18), we obtain

$$f_4^*(s_5) = \min_{x_4 \ge 0} \left[ 0.6x_4 + \frac{11.5596}{s_5 - 1.3500/x_4} \right] \tag{E19}$$

By setting

$$F(s_5, x_4) = 0.6x_4 + \frac{11.5596}{s_5 - 1.3500/x_4}$$

the minimum of $F(s_5, x_4)$, for any specified value of $s_5$, is given by

$$\frac{\partial F}{\partial x_4} = 0.6 - \frac{(11.5596)(1.3500)}{(s_5 x_4 - 1.3500)^2} = 0 \quad \text{or} \quad x_4^* = \frac{6.44}{s_5} \tag{E20}$$


$$f_4^*(s_5) = 0.6x_4^* + \frac{11.5596}{s_5 - 1.3500/x_4^*} = \frac{3.864}{s_5} + \frac{14.626}{s_5} = \frac{18.490}{s_5} \tag{E21}$$

Since the value of $s_5$ is specified as 0.5 in., the minimum weight of the structure can
be calculated from Eq. (E21) as

$$f_4^*(s_5 = 0.5) = \frac{18.490}{0.5} = 36.98 \text{ lb} \tag{E22}$$


Once the optimum value of the objective function is found, the optimum values of the
design variables can be found with the help of Eqs. (E20), (E15), (E9), and (E5) as

$$x_4^* = \frac{6.44}{s_5} = 12.88 \text{ in}^2$$

$$s_4 = s_5 - \frac{1.3500}{x_4^*} = 0.5 - 0.105 = 0.395 \text{ in.}$$

$$x_3^* = \frac{4.2445}{s_4} = 10.73 \text{ in}^2$$

$$s_3 = s_4 - \frac{1.5625}{x_3^*} = 0.3950 - 0.1456 = 0.2494 \text{ in.}$$

$$x_2^* = \frac{1.6124}{s_3} = 6.47 \text{ in}^2$$

$$s_2 = s_3 - \frac{0.6750}{x_2^*} = 0.2494 - 0.1042 = 0.1452 \text{ in.}$$

$$x_1^* = \frac{1.5625}{s_2} = 10.76 \text{ in}^2$$
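The stage-by-stage calculation of Example 9.2 can be cross-checked numerically. The sketch below is only an illustration, not part of the original solution: it assumes each member has weight $c_i x_i$ and displacement contribution $d_i / x_i$, with the pairs $(c_i, d_i)$ read off the equations above (the unit weight coefficient of member 1 is inferred from Eq. (E5)). The names `MEMBERS` and `solve_truss` are hypothetical. Because it carries full precision, the constants differ slightly from the rounded four-figure values in the text (for example 1.6125 instead of 1.6124).

```python
import math

# (c_i, d_i) for members 1..4: weight = c_i * x_i, displacement = d_i / x_i.
# Values are read off the worked equations; c_1 = 1 is inferred from Eq. (E5).
MEMBERS = [(1.0, 1.5625), (1.2, 0.6750), (1.0, 1.5625), (0.6, 1.3500)]

def solve_truss(members, s_total):
    """Return (minimum weight, [x_1*, ..., x_n*]) for total displacement s_total."""
    k = 0.0   # after stage i the optimal return has the form f_i*(s) = k / s
    a = []    # the optimal area has the form x_i* = a[i] / s_{i+1}
    for c, d in members:
        # Stationary point of F(s, x) = c*x + k*x / (s*x - d):
        # dF/dx = c - k*d / (s*x - d)**2 = 0  ->  s*x - d = sqrt(k*d/c)
        root = math.sqrt(k * d / c)
        a_i = d + root
        # For stage 1 (k = 0) the whole displacement goes to the single member.
        k = c * a_i + (k * a_i / root if root > 0.0 else 0.0)
        a.append(a_i)
    # Retrace from the last member back to the first.
    s = s_total
    areas = [0.0] * len(members)
    for i in reversed(range(len(members))):
        areas[i] = a[i] / s
        s -= members[i][1] / areas[i]   # displacement left for earlier members
    return k / s_total, areas

weight, areas = solve_truss(MEMBERS, 0.5)
print(round(weight, 2), [round(x, 2) for x in areas])
```

The areas returned are close to the values listed above, and the result can be checked independently by summing the member weights $0.6x_4 + x_3 + 1.2x_2 + x_1$ for those areas, which gives about 36.98 lb.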


9.6 EXAMPLE ILLUSTRATING THE TABULAR METHOD 
OF SOLUTION 

Example 9.3 Design the most economical reinforced cement concrete (RCC) water 
tank (Fig. 9.6a) to store 100,000 liters of water. The structural system consists of a 
tank, four columns each 10 m high, and a foundation to transfer all loads safely to the 
ground [9.10]. The design involves the selection of the most appropriate types of tank, 
columns, and foundation among the seven types of tanks, three types of columns, and 
three types of foundations available. The data on the various types of tanks, columns, 
and foundations are given in Tables 9.1, 9.2, and 9.3, respectively. 

Table 9.1 Component 3 (Tank)

                                          Load on tank,   R_3 cost   Self-weight   s_3 = s_4 +
Type of tank                              s_4 (kgf)       ($)        (kgf)         self-weight (kgf)
(a) Cylindrical RCC tank                  100,000          5,000     45,000        145,000
(b) Spherical RCC tank                    100,000          8,000     30,000        130,000
(c) Rectangular RCC tank                  100,000          6,000     25,000        125,000
(d) Cylindrical steel tank                100,000          9,000     15,000        115,000
(e) Spherical steel tank                  100,000         15,000      5,000        105,000
(f) Rectangular steel tank                100,000         12,000     10,000        110,000
(g) Cylindrical RCC tank with             100,000         10,000     15,000        115,000
    hemispherical RCC dome

Table 9.2 Component 2 (Columns)

                                          R_2 cost   Self-weight   s_2 = s_3 +
Type of columns          s_3 (kgf)        ($)        (kgf)         self-weight (kgf)
(a) RCC columns          150,000           6,000     70,000        220,000
                         130,000           5,000     50,000        180,000
                         110,000           4,000     40,000        150,000
                         100,000           3,000     40,000        140,000
(b) Concrete columns     150,000           8,000     60,000        210,000
                         130,000           6,000     50,000        180,000
                         110,000           4,000     30,000        140,000
                         100,000           3,000     15,000        115,000
(c) Steel columns        150,000          15,000     30,000        180,000
                         130,000          10,000     20,000        150,000
                         110,000           9,000     15,000        125,000
                         100,000           8,000     10,000        110,000

Table 9.3 Component 1 (Foundation)

                                               R_1 cost   Self-weight   s_1 = s_2 +
Type of foundation            s_2 (kgf)        ($)        (kgf)         self-weight (kgf)
(a) Mat foundation            220,000          5,000      60,000        280,000
                              200,000          4,000      45,000        245,000
                              180,000          3,000      35,000        215,000
                              140,000          2,500      25,000        165,000
                              100,000            500      20,000        120,000
(b) Concrete pile foundation  220,000          3,500      55,000        275,000
                              200,000          3,000      40,000        240,000
                              180,000          2,500      30,000        210,000
                              140,000          1,500      20,000        160,000
                              100,000          1,000      15,000        115,000
(c) Steel pile foundation     220,000          3,000      10,000        230,000
                              200,000          2,500       9,000        209,000
                              180,000          2,000       8,000        188,000
                              140,000          2,000       6,000        146,000
                              100,000          1,500       5,000        105,000

SOLUTION The structural system can be represented as a multistage decision process
as shown in Fig. 9.13. The decision variables $x_1$, $x_2$, and $x_3$ represent the type of
foundation, columns, and the tank used in the system, respectively. Thus the variable
$x_1$ can take three discrete values, each corresponding to a particular type of
foundation (among mat, concrete pile, and steel pile types). Similarly the variable
$x_2$ is assumed to take three discrete values, each corresponding to one of the columns
(out of RCC columns, concrete columns, and steel columns). Finally, the variable $x_3$
can take seven discrete values, each corresponding to a particular type of tank listed
in Table 9.1.

Figure 9.13 Example 9.3 as a three-stage decision problem.

Since the input load, that is, the weight of water, is known to be 100,000 kgf, $s_4$
is fixed and the problem can be considered as an initial value problem. We assume
that the theories of structural analysis and design in the various materials provide the
design equations

$$s_i = t_i(x_i, s_{i+1})$$

which yield information for the various system components as shown in Tables 9.1
to 9.3 (these values are given only for illustrative purposes).

Suboptimization of Stage 1 (Component 1)

For the suboptimization of stage 1, we isolate component 1 as shown in Fig. 9.14a
and minimize its cost $R_1(x_1, s_2)$ for any specified value of the input state $s_2$ to obtain
$f_1^*(s_2)$ as

$$f_1^*(s_2) = \min_{x_1} \left[ R_1(x_1, s_2) \right]$$

Since five settings of the input state variable $s_2$ are given in Table 9.3, we obtain $f_1^*$
for each of these values as shown below:

Specific value   x_1* (type of foundation   f_1*     Corresponding value
of s_2 (kgf)     for minimum cost)          ($)      of s_1 (kgf)
220,000          (c)                        3,000    230,000
200,000          (c)                        2,500    209,000
180,000          (c)                        2,000    188,000
140,000          (b)                        1,500    160,000
100,000          (a)                          500    120,000

Figure 9.14 Various stages of suboptimization of Example 9.3: (a) suboptimization of
component 1; (b) suboptimization of components 1 and 2; (c) suboptimization of components 1, 2,
and 3.


Suboptimization of Stages 2 and 1 (Components 2 and 1)

Here we combine components 2 and 1 as shown in Fig. 9.14b and minimize
the cost ($R_2 + R_1$) for any specified value $s_3$ to obtain $f_2^*(s_3)$ as

$$f_2^*(s_3) = \min_{x_2, x_1} \left[ R_2(x_2, s_3) + R_1(x_1, s_2) \right] = \min_{x_2} \left[ R_2(x_2, s_3) + f_1^*(s_2) \right]$$

Since four settings of the input state variable $s_3$ are given in Table 9.2, we can find $f_2^*$
for each of these four values. Since this number of settings for $s_3$ is small, the values of
the output state variable $s_2$ that result will not necessarily coincide with the values of
$s_2$ tabulated in Table 9.3. Hence we interpolate linearly between the tabulated values of
$s_2$ (if it becomes necessary) for the purpose of the present computation. However, if the
computations are done on a computer, more settings, more closely spaced, can be
considered without much difficulty. The suboptimization of stages 2 and 1 gives the
following results:

Specific     Value of x_2   Cost of     Output state   x_1*
value of     (type of       columns,    variable       (type of       f_1*       R_2 + f_1*
s_3 (kgf)    columns)       R_2 ($)     s_2 (kgf)      foundation)    ($)        ($)
150,000      (a)             6,000      220,000        (c)            3,000      [9,000]
             (b)             8,000      210,000        (c)            2,750**    10,750
             (c)            15,000      180,000        (c)            2,000      17,000
130,000      (a)             5,000      180,000        (c)            2,000      [7,000]
             (b)             6,000      180,000        (c)            2,000      8,000
             (c)            10,000      150,000        (b)            1,625**    11,625
110,000      (a)             4,000      150,000        (b)            1,625**    5,625
             (b)             4,000      140,000        (b)            1,500      [5,500]
             (c)             9,000      125,000        (b)            1,125**    10,125
100,000      (a)             3,000      140,000        (b)            1,500      4,500
             (b)             3,000      115,000        (a)              875**    [3,875]
             (c)             8,000      110,000        (a)              750**    8,750

Notice that the double-starred quantities indicate interpolated values and the bracketed
quantities the minimum cost solution for the specified value of $s_3$. Now the desired
quantities (i.e., $f_2^*$ and $x_2^*$) corresponding to the various discrete values of $s_3$ can be
summarized as follows:

                     Type of columns           Minimum cost of    Value of the
Specified value of   corresponding to minimum  stages 2 and 1,    corresponding state
s_3 (kgf)            cost of stages 2 and 1    f_2* ($)           variable, s_2 (kgf)
                     (x_2*)
150,000              (a)                       9,000              220,000
130,000              (a)                       7,000              180,000
110,000              (b)                       5,500              140,000
100,000              (b)                       3,875              115,000


Suboptimization of Stages 3, 2, and 1 (Components 3, 2, and 1)

For the suboptimization of stages 3, 2, and 1, we consider all three components
together as shown in Fig. 9.14c and minimize the cost ($R_3 + R_2 + R_1$) for any
specified value of $s_4$ to obtain $f_3^*(s_4)$. However, since there is only one value of $s_4$
(initial value problem) to be considered, we obtain the following results by using the
information given in Table 9.1:

$$f_3^*(s_4) = \min_{x_3, x_2, x_1} \left[ R_3(x_3, s_4) + R_2(x_2, s_3) + R_1(x_1, s_2) \right] = \min_{x_3} \left[ R_3(x_3, s_4) + f_2^*(s_3) \right]$$

Specific                Cost of     Corresponding   x_2* (type of
value of     Type of    tank, R_3   output state,   columns for      f_2*         R_3 + f_2*
s_4 (kgf)    tank (x_3) ($)         s_3 (kgf)       minimum cost)    ($)          ($)
100,000      (a)         5,000      145,000         (a)              8,500**      13,500
             (b)         8,000      130,000         (a)              7,000        15,000
             (c)         6,000      125,000         (a)              6,625**      [12,625]
             (d)         9,000      115,000         (b)              5,875**      14,875
             (e)        15,000      105,000         (b)              4,687.5**    19,687.5
             (f)        12,000      110,000         (b)              5,500        17,500
             (g)        10,000      115,000         (b)              5,875**      15,875

Here also the double-starred quantities indicate the interpolated values and the bracketed
quantity the minimum cost solution. From the results above, the minimum cost solution
is given by

$s_4 = 100,000$ kgf
$x_3^*$ = type (c) tank
$f_3^*(s_4 = 100,000) = \$12,625$
$s_3 = 125,000$ kgf

Now, we retrace the steps to collect the optimum values of $x_3^*$, $x_2^*$, and $x_1^*$ and obtain

$x_3^*$ = type (c) tank, $s_3 = 125,000$ kgf
$x_2^*$ = type (a) columns, $s_2 = 170,000$ kgf
$x_1^*$ = type (c) foundation, $s_1 = 181,000$ kgf

and the total minimum cost of the water tank is $12,625. Thus the minimum cost water
tank consists of a rectangular RCC tank, RCC columns, and a steel pile foundation.
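The tabular procedure of Example 9.3 lends itself to a short program. The following is a minimal sketch rather than a definitive implementation: it stores the cost data of Tables 9.1 to 9.3 in dictionaries and, as an assumption about how the interpolated entries were obtained, interpolates the optimal return $f^*$ linearly between tabulated state values. The names `interp`, `foundation`, `columns`, and `tanks` are hypothetical.

```python
def interp(x, xs, ys):
    """Linear interpolation of ys over ascending xs (x must lie within xs)."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("state outside tabulated range")

# Table 9.3: foundation type -> {s2: R1}
foundation = {
    "a": {220000: 5000, 200000: 4000, 180000: 3000, 140000: 2500, 100000: 500},
    "b": {220000: 3500, 200000: 3000, 180000: 2500, 140000: 1500, 100000: 1000},
    "c": {220000: 3000, 200000: 2500, 180000: 2000, 140000: 2000, 100000: 1500},
}
# Table 9.2: column type -> {s3: (R2, s2)}
columns = {
    "a": {150000: (6000, 220000), 130000: (5000, 180000),
          110000: (4000, 150000), 100000: (3000, 140000)},
    "b": {150000: (8000, 210000), 130000: (6000, 180000),
          110000: (4000, 140000), 100000: (3000, 115000)},
    "c": {150000: (15000, 180000), 130000: (10000, 150000),
          110000: (9000, 125000), 100000: (8000, 110000)},
}
# Table 9.1: tank type -> (R3, s3) for the single input state s4 = 100,000 kgf
tanks = {"a": (5000, 145000), "b": (8000, 130000), "c": (6000, 125000),
         "d": (9000, 115000), "e": (15000, 105000), "f": (12000, 110000),
         "g": (10000, 115000)}

# Stage 1: f1*(s2) = min over foundation types of R1
s2_grid = sorted(foundation["a"])
f1 = [min(foundation[t][s] for t in foundation) for s in s2_grid]

# Stages 2 and 1: f2*(s3) = min over column types of R2 + f1*(s2)
s3_grid = sorted(columns["a"])
f2 = [min(r2 + interp(s2, s2_grid, f1)
          for r2, s2 in (columns[t][s] for t in columns))
      for s in s3_grid]

# Stages 3, 2 and 1: R3 + f2*(s3) for each tank type
totals = {t: r3 + interp(s3, s3_grid, f2) for t, (r3, s3) in tanks.items()}
best_tank = min(totals, key=totals.get)
print(best_tank, totals[best_tank])
```

With the tabulated data, the sketch selects tank type (c) at a total cost of $12,625, in agreement with the hand calculation. Adding more closely spaced state settings only requires extending the dictionaries.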


9.7 CONVERSION OF A FINAL VALUE PROBLEM INTO 
AN INITIAL VALUE PROBLEM 

In previous sections the dynamic programming technique has been described with
reference to an initial value problem. If the problem is a final value problem as shown
in Fig. 9.15a, it can be solved by converting it into an equivalent initial value problem.
Let the stage transformation (design) equation be given by

$$s_i = t_i(s_{i+1}, x_i), \quad i = 1, 2, \ldots, n \tag{9.22}$$

Assuming that the inverse relations exist, we can write Eqs. (9.22) as

$$s_{i+1} = \bar{t}_i(s_i, x_i), \quad i = 1, 2, \ldots, n \tag{9.23}$$

where the input state to stage $i$ is expressed as a function of its output state and the
decision variable. It can be noticed that the roles of input and output state variables
are interchanged in Eqs. (9.22) and (9.23). The procedure of obtaining Eq. (9.23) from
Eq. (9.22) is called state inversion. If the return (objective) function of stage $i$ is
originally expressed as

$$R_i = r_i(s_{i+1}, x_i), \quad i = 1, 2, \ldots, n \tag{9.24}$$

Eq. (9.23) can be used to express it in terms of the output state and the decision
variable as

$$R_i = r_i[\bar{t}_i(s_i, x_i), x_i] = \bar{r}_i(s_i, x_i), \quad i = 1, 2, \ldots, n \tag{9.25}$$

The optimization problem can now be stated as follows:

Find $x_1, x_2, \ldots, x_n$ so that

$$f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} R_i = \sum_{i=1}^{n} \bar{r}_i(s_i, x_i) \tag{9.26}$$

will be optimum where the $s_i$ are related by Eq. (9.23).

The use of Eq. (9.23) amounts to reversing the direction of the flow of information
through the state variables. Thus the optimization process can be started at stage $n$ and
stages $n - 1, n - 2, \ldots, 1$ can be reached in a sequential manner. Since $s_1$ is specified
(fixed) in the original problem, the problem stated in Eq. (9.26) will be equivalent to
an initial value problem as shown in Fig. 9.15b. This initial value problem is identical
to the one considered in Fig. 9.3 except for the stage numbers. If the stage numbers
$1, 2, \ldots, n$ are reversed to $n, n - 1, \ldots, 1$, Fig. 9.15b will become identical to Fig. 9.3.
Once this is done, the solution technique described earlier can be applied for solving
the final value problem shown in Fig. 9.15a.

Figure 9.15 Conversion of a final value problem to an initial value problem: (a) final value
problem; (b) initial value problem.

Example 9.4 A small machine tool manufacturing company entered into a contract
to supply 80 drilling machines at the end of the first month and 120 at the end of the
second month. The cost of manufacturing drilling machines in any month is given
by $(50x + 0.2x^2)$ dollars, where $x$ denotes the number of drilling machines manufactured in
that month. If the company manufactures more units than needed in the first month,
there is an inventory carrying cost of $8 for each unit carried to the next month. Find
the number of drilling machines to be manufactured in each month to minimize the
total cost. Assume that the company has enough facilities to manufacture up to 200
drilling machines per month and that there is no initial inventory. Solve the problem
as a final value problem.

SOLUTION The problem can be stated as follows:

$$\text{Minimize } f(x_1, x_2) = (50x_1 + 0.2x_1^2) + (50x_2 + 0.2x_2^2) + 8(x_1 - 80)$$

subject to

$$x_1 \ge 80$$
$$x_1 + x_2 = 200$$
$$x_1 \ge 0, \quad x_2 \ge 0$$

where $x_1$ and $x_2$ indicate the number of drilling machines manufactured in the first
month and the second month, respectively. To solve this problem as a final value
problem, we start from the second month and go backward. If $I_2$ is the inventory at
the beginning of the second month, the optimum number of drilling machines to be
manufactured in the second month is given by

$$x_2^* = 120 - I_2 \tag{E1}$$

and the cost incurred in the second month by

$$R_2(x_2^*, I_2) = 8I_2 + 50x_2^* + 0.2x_2^{*2}$$

By using Eq. (E1), $R_2$ can be expressed as

$$R_2(I_2) = 8I_2 + 50(120 - I_2) + 0.2(120 - I_2)^2 = 0.2I_2^2 - 90I_2 + 8880 \tag{E2}$$

Since the inventory at the beginning of the first month is zero, the cost involved in the
first month is given by

$$R_1(x_1) = 50x_1 + 0.2x_1^2$$

Thus the total cost involved is given by

$$f_2(I_2, x_1) = (50x_1 + 0.2x_1^2) + (0.2I_2^2 - 90I_2 + 8880) \tag{E3}$$

But the inventory at the beginning of the second month is related to $x_1$ as

$$I_2 = x_1 - 80 \tag{E4}$$

Equations (E3) and (E4) lead to

$$f = f_2(x_1) = (50x_1 + 0.2x_1^2) + 0.2(x_1 - 80)^2 - 90(x_1 - 80) + 8880 = 0.4x_1^2 - 72x_1 + 17,360 \tag{E5}$$

Since $f$ is a function of $x_1$ only, the optimum value of $x_1$ can be obtained as

$$\frac{df}{dx_1} = 0.8x_1 - 72 = 0 \quad \text{or} \quad x_1^* = 90$$

As $d^2 f(x_1^*)/dx_1^2 = 0.8 > 0$, the value of $x_1^*$ corresponds to the minimum of $f$. Thus
the optimum solution is given by

$$f_{\min} = f(x_1^*) = \$14,120$$
$$x_1^* = 90 \quad \text{and} \quad x_2^* = 110$$
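The result of Example 9.4 is easy to confirm by direct enumeration. The sketch below is illustrative only; the function name `total_cost` and its keyword arguments are hypothetical. It applies Eq. (E1) to fix the second-month production for each candidate first-month production and picks the cheapest plan.

```python
def total_cost(x1, demand=(80, 120), carry_cost=8.0):
    """Total cost when x1 machines are made in month 1 (x2 follows from Eq. (E1))."""
    inventory = x1 - demand[0]          # I_2, units carried into month 2
    x2 = demand[1] - inventory          # Eq. (E1): make only what is still needed

    def production(x):
        return 50.0 * x + 0.2 * x * x   # monthly manufacturing cost

    return production(x1) + production(x2) + carry_cost * inventory

CAPACITY = 200                           # machines per month
candidates = range(80, CAPACITY + 1)     # x1 >= 80 so the first delivery is met
best_x1 = min(candidates, key=total_cost)
best_x2 = 200 - best_x1
best_cost = total_cost(best_x1)
print(best_x1, best_x2, best_cost)
```

Enumeration over integer production levels gives the same plan as the calculus-based solution, 90 machines in the first month and 110 in the second.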
9.8 LINEAR PROGRAMMING AS A CASE OF DYNAMIC
PROGRAMMING

A linear programming problem with $n$ decision variables and $m$ constraints can
be considered as an $n$-stage dynamic programming problem with $m$ state variables.
In fact, a linear programming problem can be formulated as a dynamic
programming problem. To illustrate the conversion of a linear programming problem
into a dynamic programming problem, consider the following linear programming
problem:

$$\text{Maximize } f(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} c_j x_j$$

subject to

$$\sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad i = 1, 2, \ldots, m \tag{9.27}$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, n$$

This problem can be considered as an $n$-stage decision problem where the value of
the decision variable $x_j$ must be determined at stage $j$. The right-hand sides of the
constraints, $b_i$, $i = 1, 2, \ldots, m$, can be treated as $m$ types of resources to be allocated
among different kinds of activities $x_j$. For example, $b_1$ may represent the available
machines, $b_2$ the available time, and so on, in a workshop. The variable $x_1$ may denote
the number of castings produced, $x_2$ the number of forgings produced, $x_3$ the number
of machined components produced, and so on, in the workshop. The constant $c_j$ may
represent the profit per unit of $x_j$. The coefficients $a_{ij}$ represent the amount of the $i$th
resource $b_i$ needed for 1 unit of the $j$th activity $x_j$ (e.g., the amount of material required
to produce one casting). Hence when the value of the decision variable $x_j$ at the $j$th
stage is determined, $a_{1j}x_j$ units of resource 1, $a_{2j}x_j$ units of resource 2, ..., $a_{mj}x_j$
units of resource $m$ will be allocated to the $j$th activity if sufficient unused resources exist.
Thus the amounts of the available resources must be determined before allocating
them to any particular activity. For example, when the value of the first activity $x_1$ is
determined at stage 1, there must be sufficient amounts of resources $b_i$ for allocation
to activity 1. The resources remaining after allocation to activity 1 must be determined
before the value of $x_2$ is found at stage 2, and so on. In other words, the state of the
system (i.e., the amounts of resources remaining for allocation) must be known before
making a decision (about allocation) at any stage of the $n$-stage system. In this problem
there are $m$ state parameters constituting the state vector.

By denoting the optimal value of the composite objective function over $n$ stages
as $f_n^*$, we can state the problem as

Find

$$f_n^* = f_n^*(b_1, b_2, \ldots, b_m) = \max_{x_1, x_2, \ldots, x_n} \left( \sum_{j=1}^{n} c_j x_j \right) \tag{9.28}$$

such that

$$\sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad i = 1, 2, \ldots, m \tag{9.29}$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, n \tag{9.30}$$

The recurrence relationship (9.16), when applied to this problem, yields

$$f_i^*(\beta_1, \beta_2, \ldots, \beta_m) = \max_{0 \le x_i \le \beta} \left[ c_i x_i + f_{i-1}^*(\beta_1 - a_{1i}x_i, \beta_2 - a_{2i}x_i, \ldots, \beta_m - a_{mi}x_i) \right], \quad i = 2, 3, \ldots, n \tag{9.31}$$

where $\beta_1, \beta_2, \ldots, \beta_m$ are the resources available for allocation at stage $i$; $a_{1i}x_i, \ldots,$
$a_{mi}x_i$ are the resources allocated to the activity $x_i$; $\beta_1 - a_{1i}x_i, \beta_2 - a_{2i}x_i, \ldots, \beta_m - a_{mi}x_i$
are the resources available for allocation to the activity $i - 1$; and $\beta$ indicates
the maximum value that $x_i$ can take without violating any of the constraints stated in
Eqs. (9.29). The value of $\beta$ is given by

$$\beta = \min\left( \frac{\beta_1}{a_{1i}}, \frac{\beta_2}{a_{2i}}, \ldots, \frac{\beta_m}{a_{mi}} \right) \tag{9.32}$$

since any value larger than $\beta$ would violate at least one constraint. Thus at the $i$th
stage, the optimal values $x_i^*$ and $f_i^*$ can be determined as functions of $\beta_1, \beta_2, \ldots, \beta_m$.
Finally, at the $n$th stage, since the values of $\beta_1, \beta_2, \ldots, \beta_m$ are known to be
$b_1, b_2, \ldots, b_m$, respectively, we can determine $x_n^*$ and $f_n^*$. Once $x_n^*$ is known, the
remaining values, $x_{n-1}^*, x_{n-2}^*, \ldots, x_1^*$, can be determined by retracing the
suboptimization steps.

Example 9.5†

$$\text{Maximize } f(x_1, x_2) = 50x_1 + 100x_2$$

subject to

$$10x_1 + 5x_2 \le 2500$$
$$4x_1 + 10x_2 \le 2000$$
$$x_1 + 1.5x_2 \le 450$$
$$x_1 \ge 0, \quad x_2 \ge 0$$

†This problem is the same as the one stated in Example 3.2.

SOLUTION Since $n = 2$ and $m = 3$, this problem can be considered as a two-stage
dynamic programming problem with three state parameters. The first-stage problem is
to find the maximum value of $f_1$:

$$\max f_1(\beta_1, \beta_2, \beta_3, x_1) = \max_{0 \le x_1 \le \beta} (50x_1)$$

where $\beta_1$, $\beta_2$, and $\beta_3$ are the resources available for allocation at stage 1, and $x_1$ is a
nonnegative value that satisfies the side constraints $10x_1 \le \beta_1$, $4x_1 \le \beta_2$, and $x_1 \le \beta_3$.
Here $\beta_1 = 2500 - 5x_2$, $\beta_2 = 2000 - 10x_2$, and $\beta_3 = 450 - 1.5x_2$, and hence the
maximum value $\beta$ that $x_1$ can assume is given by

$$\beta = x_1^* = \min\left[ \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right] \tag{E1}$$

Thus

$$f_1^*\left( \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right) = 50x_1^* = 50 \min\left( \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right)$$

The second-stage problem is to find the maximum value of $f_2$:

$$\max f_2(\beta_1, \beta_2, \beta_3) = \max_{0 \le x_2 \le \beta} \left[ 100x_2 + f_1^*\left( \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right) \right] \tag{E2}$$

where $\beta_1$, $\beta_2$, and $\beta_3$ are the resources available for allocation at stage 2, which are
equal to 2500, 2000, and 450, respectively. The maximum value that $x_2$ can assume
without violating any constraint is given by

$$\beta = \min\left( \frac{2500}{5}, \frac{2000}{10}, \frac{450}{1.5} \right) = 200$$

Thus the recurrence relation, Eq. (E2), can be restated as

$$\max f_2(2500, 2000, 450) = \max_{0 \le x_2 \le 200} \left\{ 100x_2 + 50 \min\left( \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right) \right\}$$

Since

$$\min\left( \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right) = \begin{cases} \dfrac{2500 - 5x_2}{10} & \text{if } 0 \le x_2 \le 125 \\ \dfrac{2000 - 10x_2}{4} & \text{if } 125 \le x_2 \le 200 \end{cases}$$

we obtain

$$\max_{0 \le x_2 \le 200} \left\{ 100x_2 + 50 \min\left( \frac{2500 - 5x_2}{10}, \frac{2000 - 10x_2}{4}, 450 - 1.5x_2 \right) \right\}$$

$$= \max \begin{cases} 100x_2 + 50 \left( \dfrac{2500 - 5x_2}{10} \right) & \text{if } 0 \le x_2 \le 125 \\ 100x_2 + 50 \left( \dfrac{2000 - 10x_2}{4} \right) & \text{if } 125 \le x_2 \le 200 \end{cases}$$

$$= \max \begin{cases} 75x_2 + 12,500 & \text{if } 0 \le x_2 \le 125 \\ 25,000 - 25x_2 & \text{if } 125 \le x_2 \le 200 \end{cases}$$

Now,

$$\max(75x_2 + 12,500) = 21,875 \text{ at } x_2 = 125$$
$$\max(25,000 - 25x_2) = 21,875 \text{ at } x_2 = 125$$

Hence

$$f_2^*(2500, 2000, 450) = 21,875 \text{ at } x_2^* = 125.0$$

From Eq. (E1) we have

$$x_1^* = \min\left( \frac{2500 - 5x_2^*}{10}, \frac{2000 - 10x_2^*}{4}, 450 - 1.5x_2^* \right) = \min(187.5, 187.5, 262.5) = 187.5$$

Thus the optimum solution of the problem is given by $x_1^* = 187.5$, $x_2^* = 125.0$, and
$f_{\max} = 21,875.0$, which can be seen to be identical with the one obtained earlier.
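The two-stage recursion of Example 9.5 can be mimicked numerically. The sketch below is an illustration under stated assumptions, not the book's procedure: it replaces the analytical maximization over $x_2$ with a search on a uniform grid (the step size of 0.5 is an arbitrary choice), and the function names `f1_star` and `solve_lp_by_dp` are hypothetical.

```python
def f1_star(beta):
    """Stage 1: the best x1 uses up the tightest remaining resource, Eq. (E1)."""
    x1 = min(beta[0] / 10.0, beta[1] / 4.0, beta[2] / 1.0)
    return 50.0 * x1, x1

def solve_lp_by_dp(b=(2500.0, 2000.0, 450.0), step=0.5):
    """Stage 2: search x2 on a grid and add the stage-1 optimal return."""
    x2_max = min(b[0] / 5.0, b[1] / 10.0, b[2] / 1.5)    # largest feasible x2
    best = (float("-inf"), 0.0, 0.0)                     # (f, x1, x2)
    for k in range(int(round(x2_max / step)) + 1):
        x2 = k * step
        # Resources left for stage 1 after allocating to activity 2.
        remaining = (b[0] - 5.0 * x2, b[1] - 10.0 * x2, b[2] - 1.5 * x2)
        ret1, x1 = f1_star(remaining)
        total = 100.0 * x2 + ret1
        if total > best[0]:
            best = (total, x1, x2)
    return best

f_max, x1_opt, x2_opt = solve_lp_by_dp()
print(f_max, x1_opt, x2_opt)
```

Because the optimum happens to lie on the grid, the search returns the same solution as the analytical treatment. In general a grid search only approximates the optimum to within the chosen step.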


Problem of Dimensionality in Dynamic Programming. The application of dynamic
programming for the solution of a linear programming problem has a serious limitation
due to the dimensionality restriction. The number of calculations needed will increase
very rapidly as the number of decision variables and state parameters increases. As an
example, consider a linear programming problem with 100 constraints. This means that
there are 100 state variables. By the procedure outlined in Section 9.4, if a table of $f_i^*$
is to be constructed in which 100 discrete values (settings) are given to each parameter,
the table contains $100^{100}$ entries. This is a gigantic number, and if the calculations are
to be performed on a high-speed digital computer, it would require $100^{96}$ seconds or
about $100^{92}$ years† merely to compute one table of $f_i^*$. Like this, 100 tables have
to be prepared, one for each decision variable. Thus it is totally out of the question
to solve a general linear programming problem of any reasonable size‡ by dynamic
programming.

†The computer is assumed to be capable of computing $10^8$ values of $f_i^*$ per second.

‡As stated in Section 4.7, LP problems with 150,000 variables and 12,000 constraints have been solved in
a matter of a few hours using some special techniques.

These comments are equally applicable for all dynamic programming problems
involving many state variables, since the computations have to be performed for
different possible values of each of the state variables. Thus this problem not only
increases the computational time but also requires a large computer memory. This
problem is known as the problem of dimensionality or the curse of dimensionality, as
termed by Bellman. This presents a serious obstacle in solving medium- and large-size
dynamic programming problems.
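The orders of magnitude quoted above follow from simple integer arithmetic. The short sketch below only restates that arithmetic; the constant names are hypothetical, and the length of a year is taken as 365 days for illustration.

```python
SETTINGS_PER_STATE = 100               # discrete values given to each state variable
STATE_VARIABLES = 100                  # one per constraint
VALUES_PER_SECOND = 10**8              # computing speed assumed in the text
SECONDS_PER_YEAR = 365 * 24 * 3600     # assumed round figure

entries = SETTINGS_PER_STATE ** STATE_VARIABLES    # size of one table
seconds = entries // VALUES_PER_SECOND
years = seconds // SECONDS_PER_YEAR

# Report powers of ten (number of digits minus one).
print(len(str(entries)) - 1, len(str(seconds)) - 1, len(str(years)) - 1)
```

The table has $10^{200} = 100^{100}$ entries, which takes $10^{192} = 100^{96}$ seconds, or on the order of $10^{184} = 100^{92}$ years, to fill.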


9.9 CONTINUOUS DYNAMIC PROGRAMMING 

If the number of stages in a multistage decision problem tends to infinity, the problem
becomes an infinite-stage or continuous problem and dynamic programming can still
be used to solve the problem. According to this notion, the trajectory optimization
problems, defined in Section 1.5, can also be considered as infinite-stage or continuous
problems.

An infinite-stage or continuous decision problem may arise in several practical 
problems. For example, consider the problem of a missile hitting a target in a specified 
(finite) time interval. Theoretically, the target has to be observed and commands to the 
missile for changing its direction and speed have to be given continuously. Thus an 
infinite number of decisions have to be made in a finite time interval. Since a stage has 
been defined as a point where decisions are made, this problem will be an infinite-stage 
or continuous problem. Another example where an infinite-stage or continuous decision 
problem arises is in planning problems. Since large industries are assumed to function 
for an indefinite amount of time, they have to do their planning on this basis. They 
make their decisions at discrete points in time by anticipating a maximum profit in the 
long run (essentially over an infinite period of time). In this section we consider the 
application of continuous decision problems. 

We have seen that the objective function in dynamic programming formulation 
is given by the sum of individual stage returns. If the number of stages tends 
to infinity, the objective function will be given by the sum of infinite terms, 
which amounts to having the objective function in the form of an integral. The 
following examples illustrate the formulation of continuous dynamic programming 
problems. 

Example 9.6 Consider a manufacturing firm that produces a certain product. The rate
of demand of this product ($p$) is known to be $p = p[x(t), t]$, where $t$ is the time of
the year and $x(t)$ is the amount of money spent on advertisement at time $t$. Assume
that the rate of production is exactly equal to the rate of demand. The production cost,
$c$, is known to be a function of the amount of production ($p$) and the production rate
($dp/dt$) as $c = c(p, dp/dt)$. The problem is to find the advertisement strategy, $x(t)$,
so as to maximize the profit between $t_1$ and $t_2$. The unit selling price ($s$) of the product
is known to be a function of the amount of production as $s = s(p) = a + b/p$, where
$a$ and $b$ are known positive constants.

SOLUTION Since the profit is given by the difference between the income from sales
and the expenditure incurred for production and advertisement, the total profit over the
period $t_1$ to $t_2$ is given by

$$f = \int_{t_1}^{t_2} \left[ p\left( a + \frac{b}{p} \right) - c\left( p, \frac{dp}{dt} \right) - x(t) \right] dt \tag{E1}$$

where $p = p[x(t), t]$. Thus the optimization problem can be stated as follows: Find
$x(t)$, $t_1 \le t \le t_2$, which maximizes the total profit, $f$, given by Eq. (E1).


Example 9.7 Consider the problem of determining the optimal temperature distribution
in a plug-flow tubular reactor [9.1]. Let the reactions carried in this type of reactor
be shown as follows:

$$X_1 \underset{k_2}{\overset{k_1}{\rightleftharpoons}} X_2 \overset{k_3}{\longrightarrow} X_3$$

where $X_1$ is the reactant, $X_2$ the desired product, and $X_3$ the undesired product, and
$k_1$, $k_2$, and $k_3$ are called rate constants. Let $x_1$ and $x_2$ denote the concentrations of the
products $X_1$ and $X_2$, respectively. The equations governing the rate of change of the
concentrations can be expressed as

$$\frac{dx_1}{dy} + k_1 x_1 = k_2 x_2 \tag{E1}$$

$$\frac{dx_2}{dy} + k_2 x_2 + k_3 x_2 = k_1 x_1 \tag{E2}$$

with the initial conditions $x_1(y = 0) = c_1$ and $x_2(y = 0) = c_2$, where $y$ is the normalized
reactor length such that $0 \le y \le 1$. In general, the rate constants depend on the
temperature ($t$) and are given by

$$k_i = a_i e^{-(b_i/t)}, \quad i = 1, 2, 3 \tag{E3}$$

where $a_i$ and $b_i$ are constants.

If the objective is to determine the temperature distribution $t(y)$, $0 \le y \le 1$, to
maximize the yield of the product $X_2$, the optimization problem can be stated as
follows:

Find $t(y)$, $0 \le y \le 1$, which maximizes

$$x_2(1) - x_2(0) = \int_{y=0}^{1} dx_2 = \int_0^1 (k_1 x_1 - k_2 x_2 - k_3 x_2) \, dy$$

where $x_1(y)$ and $x_2(y)$ have to satisfy Eqs. (E1) and (E2). Here it is assumed that the
desired temperature can be produced by some external heating device.

The classical method of approach to continuous decision problems is by the calculus
of variations.† However, the analytical solutions, using calculus of variations, cannot
be obtained except for very simple problems. The dynamic programming approach, on
the other hand, provides a very efficient numerical approximation procedure for solving
continuous decision problems. To illustrate the application of dynamic programming
to the solution of continuous decision problems, consider the following simple
(unconstrained) problem. Find the function $y(x)$ that minimizes the integral

$$f = \int_{x=a}^{b} R\left( \frac{dy}{dx}, y, x \right) dx \tag{9.33}$$

subject to the known end conditions $y(x = a) = \alpha$ and $y(x = b) = \beta$. We shall see
how dynamic programming can be used to determine $y(x)$ numerically. This approach
will not yield an analytical expression for $y(x)$ but yields the value of $y(x)$ at a finite
number of points in the interval $a \le x \le b$. To start with, the interval $(a, b)$ is divided
into $n$ segments each of length $\Delta x$ (all the segments are assumed to be of equal length
only for convenience). The grid points defining the various segments are given by

$$x_1 = a, \; x_2 = a + \Delta x, \; \ldots, \; x_i = a + (i - 1)\Delta x, \; \ldots, \; x_{n+1} = a + n \Delta x = b$$

If $\Delta x$ is small, the derivative $dy/dx$ at $x_i$ can be approximated by a forward difference
formula as

$$\frac{dy}{dx}(x_i) \simeq \frac{y_{i+1} - y_i}{\Delta x} \tag{9.34}$$

where $y_i = y(x_i)$, $i = 1, 2, \ldots, n + 1$. The integral in Eq. (9.33) can be approximated as

$$\int_a^b R\left[ \frac{dy}{dx}, y, x \right] dx \simeq \sum_{i=1}^{n} R\left[ \frac{dy}{dx}(x_i), y(x_i), x_i \right] \Delta x \tag{9.35}$$

Thus the problem can be restated as

Find $y(x_2), y(x_3), \ldots, y(x_n)$, which minimizes

$$f \simeq \Delta x \sum_{i=1}^{n} R\left( \frac{y_{i+1} - y_i}{\Delta x}, y_i, x_i \right) \tag{9.36}$$

subject to the known conditions $y_1 = \alpha$ and $y_{n+1} = \beta$.

This problem can be solved as a final value problem. Let

$$f_i^*(\theta) = \min_{y_{i+1}, y_{i+2}, \ldots, y_n} \left\{ \sum_{k=i}^{n} R\left( \frac{y_{k+1} - y_k}{\Delta x}, y_k, x_k \right) \Delta x \right\} \tag{9.37}$$

where $\theta$ is a parameter representing the various values taken by $y_i$. Then $f_i^*(\theta)$ can
also be written as

$$f_i^*(\theta) = \min_{y_{i+1}} \left[ R\left( \frac{y_{i+1} - \theta}{\Delta x}, \theta, x_i \right) \Delta x + f_{i+1}^*(y_{i+1}) \right] \tag{9.38}$$

This relation is valid for $i = 1, 2, \ldots, n - 1$, and

$$f_n^*(\theta) = R\left( \frac{\beta - \theta}{\Delta x}, \theta, x_n \right) \Delta x \tag{9.39}$$

Finally the desired minimum value is given by $f_1^*(\theta = \alpha)$.

†See Section 12.2 for additional examples of continuous decision problems and the solution techniques using
calculus of variations.

In Eqs. (9.37) to (9.39), $\theta$ or $y_i$ is a continuous variable. However, for simplicity,
we treat $\theta$ or $y_i$ as a discrete variable. Hence for each value of $i$, we find a set of
discrete values that $\theta$ or $y_i$ can assume and find the value of $f_i^*(\theta)$ for each discrete
value of $\theta$ or $y_i$. Thus $f_i^*(\theta)$ will be tabulated for only those discrete values that $\theta$ can
take. At the final stage, we find the values of $f_1^*(\alpha)$ and $y_2^*$. Once $y_2^*$ is known, the
optimal values of $y_3, y_4, \ldots, y_n$ can easily be found without any difficulty, as outlined
in the previous sections.
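The discretized recursion of Eqs. (9.38) and (9.39) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the integrand $R = (dy/dx)^2$, the end conditions $y(0) = 0$ and $y(1) = 1$, the four segments, and the grid of candidate $\theta$ values are all chosen here for demonstration and do not come from the text. The function name `solve_continuous_dp` is hypothetical, and the sketch requires at least two segments.

```python
def solve_continuous_dp(R, a, b, alpha, beta, n, thetas):
    """Minimize sum of R((y[i+1]-y[i])/dx, y[i], x[i]) * dx over grid values of y."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]   # xs[0] = x_1, ..., xs[n] = x_{n+1}
    # Last stage, Eq. (9.39): the final segment must end at y_{n+1} = beta.
    f_next = {th: R((beta - th) / dx, th, xs[n - 1]) * dx for th in thetas}
    policy = []                               # policy[k][theta] = best next y
    # Earlier stages, Eq. (9.38), moving from b toward a.
    for i in range(n - 2, -1, -1):
        starts = [alpha] if i == 0 else thetas   # y_1 is fixed at alpha
        f_cur, best = {}, {}
        for th in starts:
            cost, y_next = min(
                (R((y - th) / dx, th, xs[i]) * dx + f_next[y], y) for y in thetas
            )
            f_cur[th], best[th] = cost, y_next
        policy.insert(0, best)
        f_next = f_cur
    # Retrace the optimal trajectory from a to b.
    path = [alpha]
    for best in policy:
        path.append(best[path[-1]])
    path.append(beta)
    return f_next[alpha], path

thetas = [k / 20 for k in range(21)]          # candidate y values 0, 0.05, ..., 1
f_min, path = solve_continuous_dp(lambda dydx, y, x: dydx ** 2,
                                  0.0, 1.0, 0.0, 1.0, 4, thetas)
print(f_min, path)
```

For this integrand the exact minimizer is the straight line joining the end points, and the grid was chosen to contain it, so the sketch recovers it. With a coarser or misaligned grid the result is only an approximation whose accuracy depends on the spacing of the $\theta$ values.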

It can be seen that the solution of a continuous decision problem by dynamic
programming involves the determination of a whole family of extremal trajectories as
we move from $b$ toward $a$. In the last step we find the particular extremal trajectory that
passes through both points $(a, \alpha)$ and $(b, \beta)$. This process is illustrated in Fig. 9.16. In
this figure, $f_i^*(\theta)$ is found by knowing which of the extremal trajectories that terminate
at $x_{i+1}$ pass through the point $(x_i, \theta)$. If this procedure is followed, the solution of a
continuous decision problem poses no additional difficulties. Although the simplest type
of continuous decision problem is considered in this section, the same procedure can be
adopted to solve any general continuous decision problem involving the determination
of several functions, $y_1(x), y_2(x), \ldots, y_N(x)$, subject to $m$ constraints ($m < N$) in the
form of differential equations [9.3].


9.10 ADDITIONAL APPLICATIONS 

Dynamic programming has been applied to solve several types of engineering problems. 
Some representative applications are given in this section. 

9.10.1 Design of Continuous Beams 

Consider a continuous beam that rests on $n$ rigid supports and carries a set of prescribed
loads $P_1, P_2, \ldots, P_n$ as shown in Fig. 9.17 [9.11]. The locations of the supports
are assumed to be known and the simple plastic theory of beams is assumed to



Figure 9.16 Solution of a continuous dynamic programming problem. 
Figure 9.17 Continuous beam on rigid supports.


be applicable. Accordingly, the complete bending moment distribution can be determined
once the reactant support moments $m_1, m_2, \ldots, m_n$ are known. Once the support
moments are known (chosen), the plastic limit moment necessary for each span can be
determined and the span can be designed. The bending moment at the center of the $i$th
span is given by $-P_i l_i / 4$ and the largest bending moment in the $i$th span, $M_i$, can be
computed as

$$M_i = \max\left\{ |m_{i-1}|,\; |m_i|,\; \left| \frac{m_{i-1} + m_i}{2} - \frac{P_i l_i}{4} \right| \right\}, \quad i = 1, 2, \ldots, n \tag{9.40}$$


If the beam is uniform in each span, the limit moment for the $i$th span should be greater
than or equal to $M_i$. The cross section of the beam should be selected so that it has
the required limit moment. Thus the cost of the beam depends on the limit moment it
needs to carry. The optimization problem becomes

Find $X = \{m_1, m_2, \ldots, m_n\}^T$ which minimizes $\sum_{i=1}^{n} R_i(X)$

while satisfying the constraints $m_i \ge M_i$, $i = 1, 2, \ldots, n$, where $R_i$ denotes the cost of
the beam in the $i$th span. This problem has a serial structure and hence can be solved
using dynamic programming.


9.10.2 Optimal Layout (Geometry) of a Truss 

Consider the planar, multibay, pin-jointed cantilever truss shown in Fig. 9.18 [9.11,
9.12, 9.22]. The configuration of the truss is defined by the $x$ and $y$ coordinates of
the nodes. By assuming the lengths of the bays to be known (assumed to be unity in
Fig. 9.18) and the truss to be symmetric about the $x$ axis, the coordinates $y_1, y_2, \ldots, y_n$
define the layout (geometry) of the truss. The truss is subjected to a load (assumed to
be unity in Fig. 9.18) at the left end. The truss is statically determinate and hence the
forces in the bars belonging to bay $i$ depend only on $y_{i-1}$ and $y_i$ and not on other
coordinates $y_1, y_2, \ldots, y_{i-2}, y_{i+1}, \ldots, y_n$. Once the length of the bar and the force
developed in it are known, its cross-sectional area can be determined. This, in turn,
dictates the weight/cost of the bar. The problem of optimal layout of the truss can be
formulated and solved as a dynamic programming problem.
Figure 9.18 Multibay cantilever truss.


For concreteness, consider a three-bay truss for which the following relationships
are valid (see Fig. 9.18):

$$y_{i+1} = y_i + d_i, \quad i = 1, 2, 3 \tag{9.41}$$

Since the value of $y_1$ is fixed, the problem can be treated as an initial value problem.
If the $y$ coordinate of each of the remaining nodes is restricted to one of the four values
0.25, 0.5, 0.75, 1 (arbitrary units are used), there will be $4^3 = 64$ possible designs,
as shown in Fig. 9.19. If the cost of each bay is denoted by $R_i$,
the resulting multistage decision problem can be represented as shown in Fig. 9.5a.
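
The count of 64 designs, and the saving offered by a stage-by-stage solution, can be illustrated with a short Python sketch. The bay cost `bay_cost` used here is purely hypothetical (the text does not give the expressions for the bar forces and areas), and the fixed coordinate `Y1 = 1.0` is likewise an assumed value; only the four admissible levels and the three bays are taken from the text.

```python
from itertools import product
from math import hypot

LEVELS = (0.25, 0.5, 0.75, 1.0)  # admissible y values, from the text
Y1 = 1.0                         # assumed value of the fixed coordinate y_1


def bay_cost(y_left, y_right):
    # Hypothetical stand-in for R_i: depends only on the two end heights of a bay.
    return hypot(1.0, y_right - y_left) + y_right


def brute_force():
    """Enumerate every design (y_2, y_3, y_4) and return the cheapest one."""
    designs = list(product(LEVELS, repeat=3))

    def total(d):
        ys = (Y1,) + d
        return sum(bay_cost(ys[i], ys[i + 1]) for i in range(3))

    best = min(designs, key=total)
    return len(designs), total(best), best


def dynamic_programming():
    """Initial value problem: sweep the bays from the fixed end."""
    cost_to = {Y1: 0.0}  # cheapest cost of reaching each height so far
    back = []            # back[k][y] = best predecessor height in bay k
    for _ in range(3):
        new_cost, prev = {}, {}
        for y in LEVELS:
            new_cost[y], prev[y] = min(
                (cost_to[yp] + bay_cost(yp, y), yp) for yp in cost_to
            )
        back.append(prev)
        cost_to = new_cost
    y = min(cost_to, key=cost_to.get)
    best_cost, design = cost_to[y], [y]
    for prev in reversed(back[1:]):  # trace y_4 -> y_3 -> y_2
        y = prev[y]
        design.append(y)
    return best_cost, tuple(reversed(design))


if __name__ == "__main__":
    print(brute_force())
    print(dynamic_programming())
```

With this assumed cost, both routines should report the same design. The enumeration evaluates all 64 combinations, whereas the stagewise sweep needs only 4 + 16 + 16 bay-cost evaluations.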



Figure 9.19 Possible designs of the cantilever truss. 
9.10.3 Optimal Design of a Gear Train

Consider the gear train shown in Fig. 9.20, in which the gear pairs are numbered from 
1 to n. The pitch diameters (or the number of teeth) of the gears are assumed to be 
known and the face widths of the gear pairs are treated as design variables [9.19, 9.20]. 
The minimization of the total weight of the gear train is considered as the objective. 
When the gear train transmits power at any particular speed, bending and surface wear 
stresses will be developed in the gears. These stresses should not exceed the respective 
permissible values for a safe design. The optimization problem can be stated as 

Find $X = \{x_1, x_2, \ldots, x_n\}^T$ which minimizes

$$\sum_{i=1}^{n} R_i(X) \tag{9.42}$$

subject to

$$\sigma_{bi}(X) \le \sigma_{b\,\max}, \quad \sigma_{wi}(X) \le \sigma_{w\,\max}, \quad i = 1, 2, \ldots, n$$

where $x_i$ is the face width of gear pair $i$, $R_i$ the weight of gear pair $i$, $\sigma_{bi}$ ($\sigma_{wi}$) the
bending (surface wear) stress induced in gear pair $i$, and $\sigma_{b\,\max}$ ($\sigma_{w\,\max}$) the maximum
permissible bending (surface wear) stress. This problem can be considered as a
multistage decision problem and can be solved using dynamic programming.

9.10.4 Design of a Minimum-Cost Drainage System 

Underground drainage systems for stormwater or foul waste can be designed efficiently 
for minimum construction cost by dynamic programming [9.14]. Typically, a drainage 
system forms a treelike network in plan as shown in Fig. 9.21. The network slopes 
downward toward the outfall, using gravity to convey the wastewater to the outfall. 
Manholes are provided for cleaning and maintenance purposes at all pipe junctions. 
A representative three-element pipe segment is shown in Fig. 9.22. The design of an 



Figure 9.20 Gear train. 
Figure 9.22 Representation of a three-element pipe segment [9.14].
element consists of selecting values for the diameter of the pipe, the slope of the
pipe, and the mean depth of the pipe ($D_i$, $h_{i-1}$, and $h_i$). The construction cost of an
element, $R_i$, includes cost of the pipe, cost of the upstream manhole, and earthwork
related to excavation, backfilling, and compaction. Some of the constraints can be stated
as follows:

1. The pipe must be able to discharge the specified flow. 

2. The flow velocity must be sufficiently large. 

3. The pipe slope must be greater than a specified minimum value. 

4. The depth of the pipe must be sufficient to prevent damage from surface 
activities. 

The optimum design problem can be formulated and solved as a dynamic programming 
problem. 
REVIEW QUESTIONS

9.1 What is a multistage decision problem? 

9.2 What is the curse of dimensionality? 

9.3 State two engineering examples of serial systems that can be solved by dynamic 
programming. 

9.4 What is a return function? 

9.5 What is the difference between an initial value problem and a final value problem? 

9.6 How many state variables are to be considered if an LP problem with n variables and m 
constraints is to be solved as a dynamic programming problem? 

9.7 How can you solve a trajectory optimization problem using dynamic programming? 

9.8 Why are the components numbered in reverse order in dynamic programming? 

9.9 Define the following terms: 

(a) Principle of optimality 

(b) Boundary value problem 

(c) Monotonic function 

(d) Separable function 

9.10 Answer true or false: 

(a) Dynamic programming can be used to solve nonconvex problems. 

(b) Dynamic programming works as a decomposition technique. 
(c) The objective function, $f = (R_1 + \ldots$, is separable.

(d) A nonserial system can always be converted to an equivalent serial system by 
regrouping the components. 

(e) Both the input and the output variables are specified in a boundary value problem. 

(f) The state transformation equations are same as the design equations. 

(g) The principle of optimality and the concept of suboptimization are same. 

(h) A final value problem can always be converted into an initial value problem. 


PROBLEMS 


9.1 Four types of machine tools are to be installed (purchased) in a production shop. The 
costs of the various machine tools and the number of jobs that can be performed on each 
are given below. 


Machine tool type    Cost of machine tool ($)    Number of jobs that can be performed
1                    3500                        9
2                    2500                        4
3                    2000                        3
4                    1000                        2

If the total amount available is $10,000, determine the number of machine tools of various 
types to be purchased to maximize the number of jobs performed. Note: The number of 
machine tools purchased must be integers. 

9.2 The routes of an airline, which connects 16 cities (A, B, . . . , P ), are shown in Fig. 9.23. 
Journey from one city to another is possible only along the lines (routes) shown, with 
the associated costs indicated on the path segments. If a person wants to travel from city 
A to city P with minimum cost, without any backtracking, determine the optimal path 
(route) using dynamic programming. 

9.3 A system consists of three subsystems in series, with each subsystem consisting of several 
components in parallel, as shown in Fig. 9.24. The weights and reliabilities of the various 
components are given below: 


Subsystem, i    Weight of each component, w_i (lb)    Reliability of each component, r_i
1               4                                     0.96
2               2                                     0.92
3               6                                     0.98


The reliability of subsystem $i$ is given by $R_i = 1 - (1 - r_i)^{n_i}$, $i = 1, 2, 3$, where $n_i$ is the
number of components connected in parallel in subsystem $i$, and the overall reliability of
the system is given by $R_0 = R_1 R_2 R_3$. It was decided to use at least one and not more
than three components in any subsystem. The system is to be transported into space by
a space shuttle. If the total payload is restricted to 20 lb, find the number of components
to be used in the three subsystems to maximize the overall reliability of the system.
Figure 9.24 Three subsystems connected in series.


9.4 The altitude of an airplane flying between two cities A and F, separated by a distance of 
2000 miles, can be changed at points B, C, D, and E (Fig. 9.25). The fuel cost involved 
in changing from one altitude to another between any two consecutive points is given in 
the following table. Determine the altitudes of the airplane at the intermediate points for 
minimum fuel cost. 
Figure 9.25 Altitudes of the airplane in Example 9.4.


From                              To altitude (ft):
altitude (ft):    0       8,000    16,000    24,000    32,000    40,000
0                 —       4000     4800      5520      6160      6720
8,000             800     1600     2680      4000      4720      6080
16,000            320     480      800       2240      3120      4640
24,000            0       160      320       560       1600      3040
32,000            0       0        80        240       480       1600
40,000            0       0        0         0         160       240


9.5 Determine the path (route) corresponding to minimum cost in Problem 9.2 if a person 
wants to travel from city D to city M. 

9.6 Each of the n lathes available in a machine shop can be used to produce two types of 
parts. If z lathes are used to produce the first part, the expected profit is 3z and if z 
of them are used to produce the second part, the expected profit is 2.5 z. The lathes are 
subject to attrition so that after completing the first part, only $z/3$ out of $z$ remain available
for further work. Similarly, after completing the second part, only 2z/3 out of z remain 
available for further work. The process is repeated with the remaining lathes for two more 
stages. Find the number of lathes to be allocated to each part at each stage to maximize the 
total expected profit. Assume that any nonnegative real number of lathes can be assigned 
at each stage. 

9.7 A minimum-cost pipeline is to be laid between points (towns) A and E. The pipeline is
required to pass through one node out of $B_1$, $B_2$, and $B_3$, one out of $C_1$, $C_2$, and $C_3$,
and one out of $D_1$, $D_2$, and $D_3$ (see Fig. 9.26). The costs associated with the various
segments of the pipeline are given below:


For the segment starting at A      For the segment ending at E
A-B_1    10                        D_1-E    9
A-B_2    15                        D_2-E    6
A-B_3    12                        D_3-E    12



For the segments $B_i$ to $C_j$ and $C_i$ to $D_j$

                     To node j
From node i      1       2       3
1                8       12      19
2                9       11      13
3                7       15      14


Find the solution using dynamic programming. 

9.8 Consider the problem of controlling a chemical reactor. The desired concentration of 
material leaving the reactor is 0.8 and the initial concentration is 0.2. The concentration 
at any time $t$, $x(t)$, is given by


dx 

dt 


1+x 


u(t) 


where u(t) is a design variable (control function). 
Find u(t) which minimizes 


f=[ {l x (t) — 0.8] 2 + u 2 (t)} dt 
J o 


subject to 


0 < u(t) < 1 


Choose a grid and solve for $u(t)$ numerically using dynamic programming.

9.9 It is proposed to build thermal stations at three different sites. The total budget available 
is 3 units (1 unit = $10 million) and the feasible levels of investment on any thermal 
station are 0, 1, 2, or 3 units. The electric power obtainable (return function) for different 
investments is given below: 
Return function,       Thermal station, i
R_i(x)                 1       2       3
R_i(0)                 0       0       0
R_i(1)                 2       1       3
R_i(2)                 4       5       5
R_i(3)                 6       6       6


Find the investment policy for maximizing the total electric power generated. 

9.10 Solve the following LP problem by dynamic programming: 

Maximize $f(x_1, x_2) = 10x_1 + 8x_2$

subject to

$$2x_1 + x_2 \le 25$$
$$3x_1 + 2x_2 \le 45$$
$$x_2 \le 10$$
$$x_1 \ge 0, \quad x_2 \ge 0$$

Verify your solution by solving it graphically. 

9.11 A fertilizer company needs to supply 50 tons of fertilizer at the end of the first month, 
70 tons at the end of second month, and 90 tons at the end of third month. The cost of 
producing $x$ tons of fertilizer in any month is given by \$$(4500x + 20x^2)$. It can produce
more fertilizer in any month and supply it in the next month. However, there is an 
inventory carrying cost of $400 per ton per month. Find the optimal level of production 
in each of the three periods and the total cost involved by solving it as an initial value 
problem. 

9.12 Solve Problem 9.11 as a final value problem. 

9.13 Solve the following problem by dynamic programming: 

3 

Maximize > d } 
dii 0 

1 = 1 

subject to 

$$d_i = x_{i+1} - x_i, \quad i = 1, 2, 3$$
$$x_i = 0, 1, 2, \ldots, 5, \quad i = 1, 2$$
$$x_3 = 5, \quad x_4 = 0$$


Integer Programming 



10.1 INTRODUCTION 


In all the optimization techniques considered so far, the design variables are assumed 
to be continuous, which can take any real value. In many situations it is entirely 
appropriate and possible to have fractional solutions. For example, it is possible to use 
a plate of thickness 2.60 mm in the construction of a boiler shell, 3.34 hours of labor 
time in a project, and 1.78 lb of nitrate to produce a fertilizer. Also, in many engineering 
systems, certain design variables can only have discrete values. For example, pipes 
carrying water in a heat exchanger may be available only in diameter increments of l 
in. However, there are practical problems in which the fractional values of the design 
variables are neither practical nor physically meaningful. For example, it is not possible 
to use 1.6 boilers in a thermal power station, 1.9 workers in a project, and 2.76 lathes 
in a machine shop. If an integer solution is desired, it is possible to use any of the 
techniques described in previous chapters and round off the optimum values of the 
design variables to the nearest integer values. However, in many cases, it is very 
difficult to round off the solution without violating any of the constraints. Frequently, 
the rounding of certain variables requires substantial changes in the values of some 
other variables to satisfy all the constraints. Further, the round-off solution may give 
a value of the objective function that is very far from the original optimum value. All 
these difficulties can be avoided if the optimization problem is posed and solved as an 
integer programming problem. 

When all the variables are constrained to take only integer values in an opti- 
mization problem, it is called an all-integer programming problem. When the vari- 
ables are restricted to take only discrete values, the problem is called a discrete 
programming problem. When some variables only are restricted to take integer (dis- 
crete) values, the optimization problem is called a mixed-integer (discrete) program- 
ming problem. When all the design variables of an optimization problem are allowed 
to take on values of either zero or 1, the problem is called a zero-one program- 
ming problem. Among the several techniques available for solving the all-integer and 
mixed-integer linear programming problems, the cutting plane algorithm of Gomory 
[10.7] and the branch-and-bound algorithm of Land and Doig [10.8] have been quite 
popular. Although the zero-one linear programming problems can be solved by the 
general cutting plane or the branch-and-bound algorithms, Balas [10.9] developed an 
efficient enumerative algorithm for solving those problems. Very little work has been 
done in the field of integer nonlinear programming. The generalized penalty function 
method and the sequential linear integer (discrete) programming method can be used to 


Table 10.1 Integer Programming Methods

Linear programming problems
    All-integer problem, mixed-integer problem:
        Cutting plane method
        Branch-and-bound method
    Zero-one problem:
        Cutting plane method
        Branch-and-bound method
        Balas method

Nonlinear programming problems
    Polynomial programming problem
    General nonlinear problem
        All-integer problem, mixed-integer problem:
            Generalized penalty function method
            Sequential linear integer (discrete) programming method


solve all-integer and mixed-integer nonlinear programming problems. The various
techniques for solving integer programming problems are summarized in Table 10.1.
All these techniques are discussed briefly in this chapter.


Integer Linear Programming 

10.2 GRAPHICAL REPRESENTATION 

Consider the following integer programming problem: 

Maximize $f(X) = 3x_1 + 4x_2$

subject to

$$3x_1 - x_2 \le 12$$
$$3x_1 + 11x_2 \le 66$$
$$x_1 \ge 0 \tag{10.1}$$
$$x_2 \ge 0$$

$x_1$ and $x_2$ are integers

The graphical solution of this problem, by ignoring the integer requirements, is shown in
Fig. 10.1. It can be seen that the solution is $x_1 = 5\frac{1}{2}$, $x_2 = 4\frac{1}{2}$ with a value of $f = 34\frac{1}{2}$.
Since this is a noninteger solution, we truncate the fractional parts and obtain the new
solution as $x_1 = 5$, $x_2 = 4$, and $f = 31$. By comparing this solution with all other
integer feasible solutions (shown by dots in Fig. 10.1), we find that this solution is
optimum for the integer LP problem stated in Eqs. (10.1).

It is to be noted that truncation of the fractional part of a LP problem will not 
always give the solution of the corresponding integer LP problem. This can be illustrated 
Figure 10.1 Graphical solution of the problem stated in Eqs. (10.1).


by changing the constraint $3x_1 + 11x_2 \le 66$ to $7x_1 + 11x_2 \le 88$ in Eqs. (10.1). With
this altered constraint, the feasible region and the solution of the LP problem, without
considering the integer requirement, are shown in Fig. 10.2. The optimum solution
of this problem is identical with that of the preceding problem: namely, $x_1 = 5\frac{1}{2}$,



Figure 10.2 Graphical solution with modified constraint. 




$x_2 = 4\frac{1}{2}$, and $f = 34\frac{1}{2}$. The truncation of the fractional part of this solution gives
$x_1 = 5$, $x_2 = 4$, and $f = 31$. Although this truncated solution happened to be optimum
to the corresponding integer problem in the earlier case, it is not so in the present case.
In this case the lattice point $x_1 = 3$, $x_2 = 6$ satisfies both constraints ($3(3) - 6 = 3 \le 12$
and $7(3) + 11(6) = 87 \le 88$), and the optimum solution of the integer programming problem
is given by $x_1^* = 3$, $x_2^* = 6$, and $f^* = 33$.
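
Because both problems have only a handful of lattice points, the comparison between the truncated LP solution and the true integer optimum can be checked by direct enumeration. The following Python sketch does this; the function name `best_integer_point` and the search bound of 30 are assumptions made for illustration, while the objective and the constraints are those stated in Eqs. (10.1) and in the modified problem.

```python
def best_integer_point(second_constraint, bound=30):
    """Enumerate lattice points for: maximize 3*x1 + 4*x2
    subject to 3*x1 - x2 <= 12 and a1*x1 + a2*x2 <= rhs, x1, x2 >= 0 integer.

    `second_constraint` is the tuple (a1, a2, rhs); `bound` is an assumed
    search limit that comfortably contains both feasible regions.
    """
    a1, a2, rhs = second_constraint
    best = None
    for x1 in range(bound):
        for x2 in range(bound):
            if 3 * x1 - x2 <= 12 and a1 * x1 + a2 * x2 <= rhs:
                f = 3 * x1 + 4 * x2
                if best is None or f > best[0]:
                    best = (f, x1, x2)
    return best


if __name__ == "__main__":
    print(best_integer_point((3, 11, 66)))  # original problem, Eqs. (10.1)
    print(best_integer_point((7, 11, 88)))  # problem with the modified constraint
```

For the original constraints the search returns the truncated point (5, 4) with $f = 31$. For the modified constraint as written above it returns (3, 6) with $f = 33$; other feasible lattice points such as (0, 8) and (4, 5), each with $f = 32$, also beat the truncated point. Enumeration of this kind is practical only for very small problems, which is why systematic methods are developed in the following sections.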

10.3 GOMORY'S CUTTING PLANE METHOD 
10.3.1 Concept of a Cutting Plane

Gomory's method is based on the idea of generating a cutting plane. To illustrate
the concept of a cutting plane, we again consider the problem stated in Eqs. (10.1).
The feasible region of the problem is denoted by ABCD in Fig. 10.1. The optimal
solution of the problem, without considering the integer requirement, is given by point
C. This point corresponds to $x_1 = 5\frac{1}{2}$, $x_2 = 4\frac{1}{2}$, and $f = 34\frac{1}{2}$, which is not optimal to
the integer programming problem since the values of $x_1$ and $x_2$ are not integers. The
feasible integer solutions of the problem are denoted by dots in Fig. 10.1. These points
are called the integer lattice points.

In Fig. 10.3, the original feasible region is reduced to a new feasible region
ABEFGD by including the additional (arbitrarily selected) constraints. The idea behind
adding these additional constraints is to reduce the original feasible convex region
ABCD to a new feasible convex region (such as ABEFGD) such that an extreme
point of the new feasible region becomes an integer optimal solution to the integer
programming problem. There are two main considerations in selecting
the additional constraints: (1) the new feasible region should also be a convex
set, and (2) the part of the original feasible region that is sliced off because of the
additional constraints should not include any feasible integer solutions of the original
problem.

In Fig. 10.3, the inclusion of the two arbitrarily selected additional constraints PQ
and P'Q' gives the extreme point $F$ ($x_1 = 5$, $x_2 = 4$, $f = 31$) as the optimal solution
of the integer programming problem stated in Eqs. (10.1). Gomory's method is one in
which the additional constraints are developed in a systematic manner.


10.3.2 Gomory's Method for All-Integer Programming Problems 

In this method the given problem [Eqs. (10.1)] is first solved as an ordinary LP problem 
by neglecting the integer requirement. If the optimum values of the variables of the 
problem happen to be integers, there is nothing more to be done since the integer 
solution is already obtained. On the other hand, if one or more of the basic variables 
have fractional values, some additional constraints, known as Gomory constraints, 
that will force the solution toward an all-integer point will have to be introduced. To 
see how the Gomory constraints are generated, let the tableau corresponding to the 
optimum (noninteger) solution of the ordinary LP problem be as shown in Table 10.2. 
Here it is assumed that there are a total of m + n variables (n original variables plus 
m slack variables). At the optimal solution, the basic variables are represented as 
$x_i$ $(i = 1, 2, \ldots, m)$ and the nonbasic variables as $y_j$ $(j = 1, 2, \ldots, n)$ for convenience.

Gomory's Constraint. From Table 10.2, choose the basic variable with the largest
fractional value. Let this basic variable be $x_i$. When there is a tie in the fractional
values of the basic variables, any of them can be taken as $x_i$. This variable can be

Table 10.2 Optimum Noninteger Solution of Ordinary LP Problem

Basic        Coefficient corresponding to:                                            Objective
variables    x_1  x_2  ...  x_i  ...  x_m    y_1    y_2   ...  y_j   ...  y_n         function    Constants
x_1          1    0    ...  0    ...  0      a_11   a_12  ...  a_1j  ...  a_1n        0           b_1
x_2          0    1    ...  0    ...  0      a_21   a_22  ...  a_2j  ...  a_2n        0           b_2
...
x_i          0    0    ...  1    ...  0      a_i1   a_i2  ...  a_ij  ...  a_in        0           b_i
...
x_m          0    0    ...  0    ...  1      a_m1   a_m2  ...  a_mj  ...  a_mn        0           b_m
f            0    0    ...  0    ...  0      c_1    c_2   ...  c_j   ...  c_n         1           f
expressed, from the $i$th equation of Table 10.2, as

$$x_i = b_i - \sum_{j=1}^{n} a_{ij} y_j \tag{10.2}$$

where $b_i$ is a noninteger. Let us write

$$b_i = \hat{b}_i + \beta_i \tag{10.3}$$

$$a_{ij} = \hat{a}_{ij} + \alpha_{ij} \tag{10.4}$$

where $\hat{b}_i$ and $\hat{a}_{ij}$ denote the integers obtained by truncating the fractional parts from $b_i$
and $a_{ij}$, respectively. Thus $\beta_i$ will be a strictly positive fraction ($0 < \beta_i < 1$) and $\alpha_{ij}$
will be a nonnegative fraction ($0 \le \alpha_{ij} < 1$). With the help of Eqs. (10.3) and (10.4),
Eq. (10.2) can be rewritten as

$$\beta_i - \sum_{j=1}^{n} \alpha_{ij} y_j = x_i - \hat{b}_i + \sum_{j=1}^{n} \hat{a}_{ij} y_j \tag{10.5}$$

Since all the variables $x_i$ and $y_j$ must be integers at an optimal integer solution, the
right-hand side of Eq. (10.5) must be an integer. Thus we obtain

$$\beta_i - \sum_{j=1}^{n} \alpha_{ij} y_j = \text{integer} \tag{10.6}$$

Notice that $\alpha_{ij}$ are nonnegative fractions and $y_j$ are nonnegative integers. Hence the
quantity $\sum_{j=1}^{n} \alpha_{ij} y_j$ will always be a nonnegative number. Since $\beta_i$ is a strictly positive
fraction, we have

$$\left( \beta_i - \sum_{j=1}^{n} \alpha_{ij} y_j \right) \le \beta_i < 1 \tag{10.7}$$

As the quantity $\left( \beta_i - \sum_{j=1}^{n} \alpha_{ij} y_j \right)$ has to be an integer [from Eq. (10.6)], it can be
either a zero or a negative integer. Hence we obtain the desired constraint as

$$\beta_i - \sum_{j=1}^{n} \alpha_{ij} y_j \le 0 \tag{10.8}$$

By adding a nonnegative slack variable $s_i$, the Gomory constraint equation becomes

$$s_i - \sum_{j=1}^{n} \alpha_{ij} y_j = -\beta_i \tag{10.9}$$

where $s_i$ must also be an integer by definition.
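
The splitting of a tableau row into integer and fractional parts, Eqs. (10.3) and (10.4), is easy to mechanize. The sketch below is a minimal illustration in Python, not a full cutting plane code: the function name `gomory_cut` is hypothetical, exact rational arithmetic (`fractions.Fraction`) is used to sidestep round-off, and the integer parts are taken with `floor` (rounding down) so that every $\alpha_{ij}$ lies in $[0, 1)$ even when a coefficient $a_{ij}$ is negative, which is an assumption about how the truncation is meant.

```python
from fractions import Fraction
from math import floor


def gomory_cut(b_i, row):
    """Return (beta_i, [alpha_ij]) for the source row x_i = b_i - sum_j a_ij*y_j.

    The resulting Gomory constraint is  s_i - sum_j alpha_ij*y_j = -beta_i,
    i.e. Eq. (10.9), with s_i a new nonnegative integer slack variable.
    """
    beta = b_i - floor(b_i)               # fractional part of the constant
    if beta == 0:
        raise ValueError("b_i is already an integer; choose another row")
    alphas = [a - floor(a) for a in row]  # fractional parts of the coefficients
    return beta, alphas


if __name__ == "__main__":
    # Row with b_i = 11/2 and coefficients 11/36 and 1/36 on y_1 and y_2.
    print(gomory_cut(Fraction(11, 2), [Fraction(11, 36), Fraction(1, 36)]))
```

For the sample row, the fractional parts are $\beta_i = 1/2$, $\alpha_{i1} = 11/36$, and $\alpha_{i2} = 1/36$, so the constraint reads $s_i - \frac{11}{36} y_1 - \frac{1}{36} y_2 = -\frac{1}{2}$.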
Table 10.3 Optimal Solution with Gomory Constraint

Basic        Coefficient corresponding to:
variables    x_1  x_2  ...  x_i  ...  x_m    y_1     y_2     ...  y_j     ...  y_n       f    s_i    Constants
x_1          1    0    ...  0    ...  0      a_11    a_12    ...  a_1j    ...  a_1n      0    0      b_1
x_2          0    1    ...  0    ...  0      a_21    a_22    ...  a_2j    ...  a_2n      0    0      b_2
...
x_i          0    0    ...  1    ...  0      a_i1    a_i2    ...  a_ij    ...  a_in      0    0      b_i
...
x_m          0    0    ...  0    ...  1      a_m1    a_m2    ...  a_mj    ...  a_mn      0    0      b_m
f            0    0    ...  0    ...  0      c_1     c_2     ...  c_j     ...  c_n       1    0      f
s_i          0    0    ...  0    ...  0      -α_i1   -α_i2   ...  -α_ij   ...  -α_in     0    1      -β_i


Computational Procedure. Once the Gomory constraint is derived, the coefficients of
this constraint are inserted in a new row of the final tableau of the ordinary LP problem
(i.e., Table 10.2). Since all $y_j = 0$ in Table 10.2, the Gomory constraint equation (10.9)
becomes

$$s_i = -\beta_i = \text{negative}$$

which is infeasible. This means that the original optimal solution does not satisfy this
new constraint. To obtain a new optimal solution that satisfies the new constraint,
Eq. (10.9), the dual simplex method discussed in Chapter 4 can be used. The new
tableau, after adding the Gomory constraint, is as shown in Table 10.3.

After finding the new optimum solution by applying the dual simplex method, test
whether the new solution is all-integer or not. If the new optimum solution is all-integer,
the process ends. On the other hand, if any of the basic variables in the new solution
take on fractional values, a new Gomory constraint is derived from the new simplex
tableau and the dual simplex method is applied again. This procedure is continued until
either an optimal integer solution is obtained or the dual simplex method indicates that
the problem has no feasible integer solution.
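
The pivot selection performed by the dual simplex method after a Gomory row is appended can be sketched as follows. This is a hedged illustration only: the function name `dual_simplex_pivot` and the data layout (constants `b`, coefficient rows restricted to the nonbasic columns, and cost coefficients `c`) are assumptions, and the rule coded here is the usual dual simplex rule (most negative constant for the row, smallest ratio $c_j / (-a_{rj})$ over $a_{rj} < 0$ for the column). The sample numbers are those of the tableau of Example 10.1 after the Gomory row has been added.

```python
from fractions import Fraction


def dual_simplex_pivot(b, rows, c):
    """Pick the pivot (row r, column s) for one dual simplex iteration.

    b[i]    : constant of row i
    rows[i] : coefficients of the nonbasic variables in row i
    c[j]    : (nonnegative) cost coefficient of nonbasic variable j
    Returns None when every constant is nonnegative (tableau already feasible).
    """
    r = min(range(len(b)), key=lambda i: b[i])  # most negative constant
    if b[r] >= 0:
        return None
    candidates = [j for j, a in enumerate(rows[r]) if a < 0]
    if not candidates:
        # No negative entry in the pivot row: no feasible solution exists.
        raise ValueError("problem has no feasible solution")
    s = min(candidates, key=lambda j: c[j] / -rows[r][j])  # minimum ratio test
    return r, s


if __name__ == "__main__":
    F = Fraction
    b = [F(11, 2), F(9, 2), F(-1, 2)]     # rows x_1, x_2, s_1
    rows = [[F(11, 36), F(1, 36)],
            [F(-1, 12), F(1, 12)],
            [F(-11, 36), F(-1, 36)]]      # columns y_1, y_2
    c = [F(7, 12), F(5, 12)]
    print(dual_simplex_pivot(b, rows, c))
```

For these data the two ratios are $\frac{7}{12} \cdot \frac{36}{11} = \frac{21}{11}$ and $\frac{5}{12} \cdot 36 = 15$, so the Gomory row is the pivot row and the column of $y_1$ is the pivot column.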

Remarks: 

1. If there is no feasible integer solution to the given (primal) problem, this can 
be detected by noting an unbounded condition for the dual problem. 

2. The application of the dual simplex method to remove the infeasibility of 
Eq. (10.9) is equivalent to cutting off the original feasible solution toward the 
optimal integer solution. 

3. This method has a serious drawback. This is associated with the round-off
errors that arise during numerical computations. Due to these round-off errors,
we may ultimately get a wrong optimal integer solution. This can be rectified by
storing the numbers as fractions instead of as decimal quantities. However, the
magnitudes of the numerators and denominators of the fractional numbers, after
some calculations, may exceed the capacity of the computer. This difficulty can
be avoided by using the all-integer integer programming algorithm developed
by Gomory [10.10].

4. For obtaining the optimal solution of an ordinary LP problem, we start from a
basic feasible solution (at the start of phase II) and find a sequence of improved
basic feasible solutions until the optimum basic feasible solution is found. During
this process, if the computations have to be terminated at any stage (for
some reason), the current basic feasible solution can be taken as an approximation
to the optimum solution. However, this cannot be done if we apply
Gomory's method for solving an integer programming problem. This is due to
the fact that the problem remains infeasible in the sense that no integer solution
can be obtained until the whole problem is solved. Thus we will not have
any good integer solution that can be taken as an approximate optimum
solution in case the computations have to be terminated in the middle of the
process.

5. From the description given above, the number of Gomory constraints to be
generated might appear to be very large, especially if the solution converges
slowly. If the number of constraints really becomes very large, the size of the
problem also grows without bound since one (slack) variable and one constraint
are added with the addition of each Gomory constraint. However, it can be
observed that the total number of constraints in the modified tableau will not
exceed the number of variables in the original problem, namely, $n + m$. The
original problem has $m$ equality constraints in $n + m$ variables and we observe
that there are $n$ nonbasic variables. When a Gomory constraint is added, the
number of constraints and the number of variables will each be increased by
one, but the number of nonbasic variables will remain $n$. Hence at most $n$
slack variables of Gomory constraints can be nonbasic at any time, and any
additional Gomory constraint must be redundant. In other words, at most $n$
Gomory constraints can be binding at a time. If an $(n + 1)$th constraint is
present (with its slack variable as a basic and positive variable), it must be implied
by the remaining constraints. Hence we drop any Gomory constraint once its
slack variable becomes basic in a feasible solution.


Example 10.1 


Minimize $f = -3x_1 - 4x_2$

subject to

$$3x_1 - x_2 + x_3 = 12$$
$$3x_1 + 11x_2 + x_4 = 66$$
$$x_i \ge 0, \quad i = 1 \text{ to } 4$$

all $x_i$ are integers

This problem can be seen to be the same as the one stated in Eqs. (10.1) with the addition
of slack variables $x_3$ and $x_4$.
SOLUTION

Step 1: Solve the LP problem by neglecting the integer requirement of the variables
$x_i$, $i = 1$ to 4, using the regular simplex method as shown below:


Basic        Coefficients of variables
variables    x_1       x_2     x_3      x_4      -f     b_i      b_i/a_is for a_is > 0
x_3          3         -1      1        0        0      12
x_4          3         11*     0        1        0      66       6  <- smaller one
-f           -3        -4      0        0        1      0
(* pivot element; the most negative c_j is in the x_2 column)

Result of pivoting:
x_3          36/11*    0       1        1/11     0      18       11/2  <- smaller one
x_2          3/11      1       0        1/11     0      6        22
-f           -21/11    0       0        4/11     1      24
(* pivot element; the most negative c_j is in the x_1 column)

Result of pivoting:
x_1          1         0       11/36    1/36     0      11/2
x_2          0         1       -1/12    1/12     0      9/2
-f           0         0       7/12     5/12     1      69/2

Since all the cost coefficients are nonnegative, the last tableau gives the 
optimum solution as 

$$x_1 = \tfrac{11}{2}, \quad x_2 = \tfrac{9}{2}, \quad x_3 = 0, \quad x_4 = 0, \quad f_{\min} = -\tfrac{69}{2}$$

which can be seen to be identical to the graphical solution obtained in 
Section 10.2. 

Step 2: Generate a Gomory constraint. Since the solution above is noninteger, a 
Gomory constraint has to be added to the last tableau. Since there is a tie 
between $x_1$ and $x_2$, let us select $x_1$ as the basic variable having the largest 
fractional value. From the row corresponding to $x_1$ in the last tableau, we can 
write 

$$x_1 = \tfrac{11}{2} - \tfrac{11}{36} y_1 - \tfrac{1}{36} y_2 \tag{E_1}$$

where $y_1$ and $y_2$ are used in place of $x_3$ and $x_4$ to denote the nonbasic variables. 
By comparing Eq. (E1) with Eq. (10.2), we find that 

$$i = 1, \quad b_1 = \tfrac{11}{2}, \quad \hat{b}_1 = 5, \quad \beta_1 = \tfrac{1}{2}, \quad a_{11} = \tfrac{11}{36}, \quad \hat{a}_{11} = 0, \quad \alpha_{11} = \tfrac{11}{36},$$
$$a_{12} = \tfrac{1}{36}, \quad \hat{a}_{12} = 0, \quad \text{and} \quad \alpha_{12} = \tfrac{1}{36}$$

From Eq. (10.9), the Gomory constraint can be expressed as 

$$s_1 - \alpha_{11} y_1 - \alpha_{12} y_2 = -\beta_1 \tag{E_2}$$

where $s_1$ is a new nonnegative (integer) slack variable. Equation (E2) can be 
written as 

$$s_1 - \tfrac{11}{36} y_1 - \tfrac{1}{36} y_2 = -\tfrac{1}{2} \tag{E_3}$$

By introducing this constraint, Eq. (E3), into the previous optimum tableau, 
we obtain the new tableau shown below: 


Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      s_1     b_i     b_i/a_is for a_is > 0
x_1          1       0       11/36   1/36    0       0       11/2
x_2          0       1       -1/12   1/12    0       0       9/2
-f           0       0       7/12    5/12    1       0       69/2
s_1          0       0       -11/36  -1/36   0       1       -1/2


Step 3: Apply the dual simplex method to find a new optimum solution. For this, we 
select the pivotal row r such that $b_r = \min(b_i < 0) = -\tfrac{1}{2}$, corresponding to $s_1$ 
in this case. The pivot column s is selected such that 

$$\frac{c_s}{-a_{rs}} = \min_{a_{rj} < 0} \left( \frac{c_j}{-a_{rj}} \right)$$

Here 

$$\frac{c_j}{-a_{rj}} = \frac{7}{12} \times \frac{36}{11} = \frac{21}{11} \quad \text{for column } y_1$$
$$\frac{c_j}{-a_{rj}} = \frac{5}{12} \times \frac{36}{1} = 15 \quad \text{for column } y_2$$

Since $\tfrac{21}{11}$ is the minimum of $\tfrac{21}{11}$ and 15, the pivot element will be $-\tfrac{11}{36}$. The 
result of the pivot operation is given in the following tableau: 


Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      s_1     b_i     b_i/a_is for a_is > 0
x_1          1       0       0       0       0       1       5
x_2          0       1       0       1/11    0       -3/11   51/11
-f           0       0       0       4/11    1       21/11   369/11
y_1          0       0       1       1/11    0       -36/11  18/11


The solution given by the present tableau is $x_1 = 5$, $x_2 = 4\tfrac{7}{11}$, $y_1 = 1\tfrac{7}{11}$, and 
$f = -33\tfrac{6}{11}$, in which some variables are still nonintegers. 

Step 4: Generate a new Gomory constraint. To generate the new Gomory constraint, 
we arbitrarily select $x_2$ as the variable having the largest fractional value (since 
there is a tie between $x_2$ and $y_1$). The row corresponding to $x_2$ gives 

$$x_2 = \tfrac{51}{11} - \tfrac{1}{11} y_2 + \tfrac{3}{11} s_1$$

From this equation, the Gomory constraint [Eq. (10.9)] can be written as 

$$s_2 - \tfrac{1}{11} y_2 + \tfrac{3}{11} s_1 = -\tfrac{7}{11}$$

When this constraint is added to the previous tableau, we obtain the following 
tableau: 


Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      s_1     s_2     b_i
x_1          1       0       0       0       0       1       0       5
x_2          0       1       0       1/11    0       -3/11   0       51/11
y_1          0       0       1       1/11    0       -36/11  0       18/11
-f           0       0       0       4/11    1       21/11   0       369/11
s_2          0       0       0       -1/11   0       3/11    1       -7/11


Step 5: Apply the dual simplex method to find a new optimum solution. To carry out the 
pivot operation, the pivot row is selected to correspond to the most negative 
value of $b_i$. This is the $s_2$ row in this case. 
Since only the $a_{rj}$ corresponding to column $y_2$ is negative, the pivot element 
will be $-\tfrac{1}{11}$ in the $s_2$ row. The pivot operation on this element leads to the 
following tableau: 


Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      s_1     s_2     b_i
x_1          1       0       0       0       0       1       0       5
x_2          0       1       0       0       0       0       1       4
y_1          0       0       1       0       0       -3      1       1
-f           0       0       0       0       1       3       4       31
y_2          0       0       0       1       0       -3      -11     7


The solution given by this tableau is $x_1 = 5$, $x_2 = 4$, $y_1 = 1$, $y_2 = 7$, and 
$f = -31$, which can be seen to satisfy the integer requirement. Hence this is 
the desired solution. 
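
The two ingredients of Example 10.1 can be checked with a short sketch. It is an illustration only: the function names `fractional_cut` and `brute_force_optimum` are hypothetical, fractional parts are taken with `floor` (so that every $\alpha_{ij}$ lies in $[0, 1)$, an assumption of this sketch), and the enumeration bounds are read off the two constraints of the example. The first function reproduces the coefficients of Eq. (E3) from the $x_1$ row; the second confirms the integer optimum by plain enumeration.

```python
from fractions import Fraction
from math import floor


def fractional_cut(b, a):
    """Return (alphas, beta) for the cut  s - sum(alpha_j * y_j) = -beta.

    b is the right-hand side of a tableau row and a holds the coefficients
    of the nonbasic variables in that row (all as exact fractions).
    """
    beta = b - floor(b)                        # fractional part of b
    alphas = [a_j - floor(a_j) for a_j in a]   # fractional parts of a_j
    return alphas, beta


def brute_force_optimum():
    """Enumerate the integer points of Example 10.1 and return (f, x1, x2)."""
    best = None
    for x1 in range(0, 23):        # 3*x1 <= 66 bounds x1 by 22
        for x2 in range(0, 7):     # 11*x2 <= 66 bounds x2 by 6
            if 3 * x1 - x2 <= 12 and 3 * x1 + 11 * x2 <= 66:
                f = -3 * x1 - 4 * x2
                if best is None or f < best[0]:
                    best = (f, x1, x2)
    return best
```

Calling `fractional_cut(Fraction(11, 2), [Fraction(11, 36), Fraction(1, 36)])` gives the coefficients 11/36 and 1/36 with right-hand side 1/2, and `brute_force_optimum()` returns the point $x_1 = 5$, $x_2 = 4$ with $f = -31$.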


10.3.3 Gomory's Method for Mixed-Integer Programming Problems 

The method discussed in Section 10.3.2 is applicable to solve all integer programming 
problems where both the decision and slack variables are restricted to integer values in 
the optimal solution. In the mixed-integer programming problems, only a subset of the 
decision and slack variables are restricted to integer values. The procedure for solving 
mixed-integer programming problems is similar to that of all-integer programming 
problems in many respects. 


Solution Procedure. As in the case of an all-integer programming problem, the first 
step involved in the solution of a mixed-integer programming problem is to obtain an 
optimal solution of the ordinary LP problem without considering the integer restrictions. 
If the values of the basic variables, which were restricted to integer values, happen to 
be integers in this optimal solution, there is nothing more to be done. Otherwise, a 
Gomory constraint is formulated by taking the integer-restricted basic variable, which 
has the largest fractional value in the optimal solution of the ordinary LP problem. 

Let $x_i$ be the basic variable that has the largest fractional value in the optimal 
solution (as shown in Table 10.2), although it is restricted to take on only integer 
values. If the nonbasic variables are denoted as $y_j$, $j = 1, 2, \ldots, n$, the basic variable 
$x_i$ can be expressed as (from Table 10.2) 

$$x_i = b_i - \sum_{j=1}^{n} a_{ij} y_j \tag{10.2}$$

We can write 

$$b_i = \hat{b}_i + \beta_i \tag{10.3}$$

where $\hat{b}_i$ is the integer obtained by truncating the fractional part of $b_i$, and $\beta_i$ is the 
fractional part of $b_i$. By defining 

$$a_{ij} = a_{ij}^{+} + a_{ij}^{-} \tag{10.10}$$

where 

$$a_{ij}^{+} = \begin{cases} a_{ij} & \text{if } a_{ij} \ge 0 \\ 0 & \text{if } a_{ij} < 0 \end{cases} \tag{10.11}$$

$$a_{ij}^{-} = \begin{cases} 0 & \text{if } a_{ij} \ge 0 \\ a_{ij} & \text{if } a_{ij} < 0 \end{cases} \tag{10.12}$$


Eq. (10.2) can be rewritten as 

$$\sum_{j=1}^{n} (a_{ij}^{+} + a_{ij}^{-}) y_j = \beta_i + (\hat{b}_i - x_i) \tag{10.13}$$

Here, by assumption, $x_i$ is restricted to integer values while $b_i$ is not an integer. Since 
$0 < \beta_i < 1$ and $\hat{b}_i$ is an integer, we can have the value of $\beta_i + (\hat{b}_i - x_i)$ either $\ge 0$ or 
$< 0$. First, we consider the case where 

$$\beta_i + (\hat{b}_i - x_i) \ge 0 \tag{10.14}$$

In this case, in order for $x_i$ to be an integer, we must have 

$$\beta_i + (\hat{b}_i - x_i) = \beta_i \text{ or } \beta_i + 1 \text{ or } \beta_i + 2, \ldots \tag{10.15}$$

Thus Eq. (10.13) gives 

$$\sum_{j=1}^{n} (a_{ij}^{+} + a_{ij}^{-}) y_j \ge \beta_i \tag{10.16}$$

Since $a_{ij}^{-}$ are nonpositive and $y_j$ are nonnegative by definition, we have 

$$\sum_{j=1}^{n} a_{ij}^{+} y_j \ge \sum_{j=1}^{n} (a_{ij}^{+} + a_{ij}^{-}) y_j \tag{10.17}$$

and hence 

$$\sum_{j=1}^{n} a_{ij}^{+} y_j \ge \beta_i \tag{10.18}$$


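Next, we consider the case where 

$$\beta_i + (\hat{b}_i - x_i) < 0 \tag{10.19}$$

For $x_i$ to be an integer, we must have (since $0 < \beta_i < 1$) 

$$\beta_i + (\hat{b}_i - x_i) = -1 + \beta_i \text{ or } -2 + \beta_i \text{ or } -3 + \beta_i, \ldots \tag{10.20}$$

Thus Eq. (10.13) yields 

$$\sum_{j=1}^{n} (a_{ij}^{+} + a_{ij}^{-}) y_j \le \beta_i - 1 \tag{10.21}$$

Since 

$$\sum_{j=1}^{n} a_{ij}^{-} y_j \le \sum_{j=1}^{n} (a_{ij}^{+} + a_{ij}^{-}) y_j$$

we obtain 

$$\sum_{j=1}^{n} a_{ij}^{-} y_j \le \beta_i - 1 \tag{10.22}$$

Upon dividing this inequality by the negative quantity $(\beta_i - 1)$, we obtain 

$$\frac{1}{\beta_i - 1} \sum_{j=1}^{n} a_{ij}^{-} y_j \ge 1 \tag{10.23}$$

Multiplying both sides of this inequality by $\beta_i > 0$, we can write the inequality 
(10.23) as 

$$\frac{\beta_i}{\beta_i - 1} \sum_{j=1}^{n} a_{ij}^{-} y_j \ge \beta_i \tag{10.24}$$
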

Since one of the inequalities in (10.18) and (10.24) must be satisfied, the following 
inequality must hold true: 

$$\sum_{j=1}^{n} a_{ij}^{+} y_j + \frac{\beta_i}{\beta_i - 1} \sum_{j=1}^{n} a_{ij}^{-} y_j \ge \beta_i \tag{10.25}$$

By introducing a slack variable $s_i$, we obtain the desired Gomory constraint as 

$$s_i = \sum_{j=1}^{n} a_{ij}^{+} y_j + \frac{\beta_i}{\beta_i - 1} \sum_{j=1}^{n} a_{ij}^{-} y_j - \beta_i \tag{10.26}$$

This constraint must be satisfied before the variable $x_i$ becomes an integer. The slack 
variable $s_i$ is not required to be an integer. At the optimal solution of the ordinary LP 
problem (given by Table 10.2), all $y_j = 0$ and hence Eq. (10.26) becomes 

$$s_i = -\beta_i = \text{negative}$$

which can be seen to be infeasible. Hence the constraint Eq. (10.26) is added at the 
end of Table 10.2, and the dual simplex method applied. This procedure is repeated 
the required number of times until the optimal mixed integer solution is found. 
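
The coefficients of Eq. (10.26) can be generated mechanically from one tableau row. The following is a minimal sketch under stated assumptions: `mixed_integer_cut` is a hypothetical helper name, the row data in the usage note are made up for illustration (they are not taken from Table 10.2), and exact fractions are used to avoid round-off.

```python
from fractions import Fraction
from math import floor


def mixed_integer_cut(b, a):
    """Coefficients of Eq. (10.26):  s = sum(coef_j * y_j) - beta.

    b is the (noninteger) right-hand side of the row of the integer-restricted
    basic variable; a holds the coefficients a_ij of the nonbasic variables.
    """
    beta = b - floor(b)                 # fractional part, 0 < beta < 1
    if beta == 0:
        raise ValueError("b is already an integer; no cut is needed")
    ratio = beta / (beta - 1)           # negative multiplier of the a_ij^- terms
    # a_ij^+ terms keep their coefficient; a_ij^- terms are scaled by ratio
    coefs = [a_j if a_j >= 0 else ratio * a_j for a_j in a]
    return coefs, beta
```

For a hypothetical row with $b_i = 7/3$ and coefficients $1/4$ and $-3/2$, the sketch gives $\beta_i = 1/3$ and the cut $s_i = \tfrac{1}{4} y_1 + \tfrac{3}{4} y_2 - \tfrac{1}{3}$; both coefficients are nonnegative, so $s_i = -\beta_i < 0$ at $y_j = 0$, as stated above.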

Discussion. In the derivation of the Gomory constraint, Eq. (10.26), we have not 
made use of the fact that some of the variables ($y_j$) might be integer variables. We 
notice that any integer value $\delta$ can be added to or subtracted from the coefficient 
$a_{ik}$ ($= a_{ik}^{+} + a_{ik}^{-}$) of an integer variable $y_k$ provided that we subtract or add, respectively, 
the same value to $x_i$ in Eq. (10.13), that is, 

$$\sum_{\substack{j=1 \\ j \ne k}}^{n} a_{ij} y_j + (a_{ik} \pm \delta) y_k = \beta_i + \hat{b}_i - (x_i \mp \delta) \tag{10.27}$$

From Eq. (10.27), the same logic as was used in the derivation of Eqs. (10.18) and 
(10.24) can be used to obtain the same final equation, Eq. (10.26). Of course, the 
coefficients of integer variables will be altered by integer amounts in Eq. (10.26). 
It has been established that to cut the feasible region as much as possible (through the 
Gomory constraint), we have to make the coefficients of integer variables $y_k$ as small 
as possible. We can see that the smallest positive coefficient we can have for $y_j$ in 
Eq. (10.13) is 

$$\alpha_{ij} = a_{ij} - \hat{a}_{ij}$$

and the largest negative coefficient (in magnitude) as 

$$1 - \alpha_{ij} = 1 - a_{ij} + \hat{a}_{ij}$$

where $\hat{a}_{ij}$ is the integer obtained by truncating the fractional part of $a_{ij}$ and $\alpha_{ij}$ is the 
fractional part. Thus we have a choice of two expressions, $(a_{ij} - \hat{a}_{ij})$ and $(1 - a_{ij} + \hat{a}_{ij})$, 
for the coefficients of $y_j$ in Eq. (10.26). We choose the smaller one out of the 
two to make the Gomory constraint, Eq. (10.26), cut deeper into the original feasible 
space. Thus Eq. (10.26) can be rewritten as 

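$$s_i = \sum_{j} a_{ij}^{+} y_j + \frac{\beta_i}{\beta_i - 1} \sum_{j} a_{ij}^{-} y_j \qquad \text{(for noninteger variables } y_j\text{)}$$
$$\quad + \sum_{j} (a_{ij} - \hat{a}_{ij}) y_j \qquad \text{(for integer variables } y_j \text{ and for } a_{ij} - \hat{a}_{ij} \le \beta_i\text{)}$$
$$\quad + \frac{\beta_i}{1 - \beta_i} \sum_{j} (1 - a_{ij} + \hat{a}_{ij}) y_j - \beta_i \qquad \text{(for integer variables } y_j \text{ and for } a_{ij} - \hat{a}_{ij} > \beta_i\text{)}$$

where the slack variable $s_i$ is not restricted to be an integer. 

Example 10.2 Solve the problem of Example 10.1 with $x_2$ only restricted to take 
integer values. 

SOLUTION 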

Step 1: Solve the LP problem by the simplex method by neglecting the integer requirement. 
This gives the following optimal tableau: 

Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      b_i
x_1          1       0       11/36   1/36    0       11/2
x_2          0       1       -1/12   1/12    0       9/2
-f           0       0       7/12    5/12    1       69/2


The noninteger solution given by this tableau is 

$$x_1 = 5\tfrac{1}{2}, \quad x_2 = 4\tfrac{1}{2}, \quad y_1 = y_2 = 0, \quad \text{and} \quad f_{\min} = -34\tfrac{1}{2}.$$

Step 2: Formulate a Gomory constraint. Since $x_2$ is the only variable that is restricted 
to take integer values, we construct the Gomory constraint for $x_2$. From the 
tableau of step 1, we obtain 

$$x_2 = b_2 - a_{21} y_1 - a_{22} y_2$$

where 

$$b_2 = \tfrac{9}{2}, \quad a_{21} = -\tfrac{1}{12}, \quad \text{and} \quad a_{22} = \tfrac{1}{12}$$

According to Eq. (10.3), we write $b_2$ as $b_2 = \hat{b}_2 + \beta_2$ where $\hat{b}_2 = 4$ and $\beta_2 = \tfrac{1}{2}$. 
Similarly, we write from Eq. (10.10) 

$$a_{21} = a_{21}^{+} + a_{21}^{-}$$
$$a_{22} = a_{22}^{+} + a_{22}^{-}$$

where 

$$a_{21}^{+} = 0, \quad a_{21}^{-} = -\tfrac{1}{12} \quad \text{(since } a_{21} \text{ is negative)}$$
$$a_{22}^{+} = \tfrac{1}{12}, \quad a_{22}^{-} = 0 \quad \text{(since } a_{22} \text{ is nonnegative)}$$

The Gomory constraint can be expressed as [from Eq. (10.26)]: 

$$s_2 - \sum_{j=1}^{2} a_{2j}^{+} y_j + \frac{\beta_2}{\beta_2 - 1} \sum_{j=1}^{2} a_{2j}^{-} y_j = -\beta_2$$

where $s_2$ is a slack variable that is not required to take integer values. By 
substituting the values of $a_{ij}^{+}$, $a_{ij}^{-}$, and $\beta_i$, this constraint can be written as 

$$s_2 + \tfrac{1}{12} y_1 - \tfrac{1}{12} y_2 = -\tfrac{1}{2}$$

When this constraint is added to the tableau above, we obtain the following: 


Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      s_2     b_i
x_1          1       0       11/36   1/36    0       0       11/2
x_2          0       1       -1/12   1/12    0       0       9/2
-f           0       0       7/12    5/12    1       0       69/2
s_2          0       0       1/12    -1/12   0       1       -1/2


Step 3: Apply the dual simplex method to find a new optimum solution. Since $-\tfrac{1}{2}$ is the 
only negative $b_i$ term, the pivot operation has to be done in the $s_2$ row. Further, the 
$a_{ij}$ corresponding to the $y_2$ column is the only negative coefficient in the $s_2$ row and hence 
pivoting has to be done on this element, $-\tfrac{1}{12}$. The result of the pivot operation is 
shown in the following tableau: 




Basic                  Coefficients of variables
variables    x_1     x_2     y_1     y_2     -f      s_2     b_i
x_1          1       0       1/3     0       0       1/3     16/3
x_2          0       1       0       0       0       1       4
-f           0       0       1       0       1       5       32
y_2          0       0       -1      1       0       -12     6


This tableau gives the desired mixed-integer solution as 

$$x_1 = 5\tfrac{1}{3}, \quad x_2 = 4, \quad y_2 = 6, \quad y_1 = 0, \quad s_2 = 0, \quad \text{and} \quad f_{\min} = -32$$


10.4 BALAS' ALGORITHM FOR ZERO-ONE PROGRAMMING 
PROBLEMS 

When all the variables of an LP problem are constrained to take values of 0 or 1 only, we 
have a zero-one (or binary) LP problem. A study of the various techniques available 
for solving zero-one programming problems is important for the following reasons: 

1. As we shall see later in this chapter (Section 10.5), a certain class of integer 
nonlinear programming problems can be converted into equivalent zero-one 
LP problems. 

2. A wide variety of industrial, management, and engineering problems can be for- 
mulated as zero-one problems. For example, in structural control, the problem 
of selecting optimal locations of actuators (or dampers) can be formulated as a 
zero-one problem. In this case, if a variable is zero or 1, it indicates the absence 
or presence of the actuator, respectively, at a particular location [10.31]. 

The zero-one LP problems can be solved by using any of the general integer LP 
techniques like Gomory's cutting plane method and Land and Doig's branch-and-bound 
method by introducing the additional constraint that all the variables must be less than 
or equal to 1. Together with the nonnegativity and integer requirements, this additional 
constraint will restrict each of the variables to take a value of either zero (0) or one (1). 
Since the cutting plane and the branch-and-bound 
algorithms were developed primarily to solve a general integer LP problem, they do not 
take advantage of the special features of zero-one LP problems. Thus several methods 
have been proposed to solve zero-one LP problems more efficiently. In this section 
we present an algorithm developed by Balas (in 1965) for solving LP problems with 
binary variables only [10.9]. 

If there are n binary variables in a problem, an explicit enumeration process will 
involve testing $2^n$ possible solutions against the stated constraints and the objective 
function. In Balas' method, all the $2^n$ possible solutions are enumerated, explicitly or 
implicitly. The efficiency of the method arises out of the clever strategy it adopts in 
selecting only a few solutions for explicit enumeration. 

The method starts by setting all the n variables equal to zero and consists of a 
systematic procedure of successively assigning to certain variables the value 1, in such 
a way that after trying a (small) part of all the $2^n$ possible combinations, one obtains 
either an optimal solution or evidence of the fact that no feasible solution exists. The 
only operations required in the computation are additions and subtractions, and hence 
no round-off errors are introduced. For this reason the method is sometimes referred 
to as the additive algorithm. 
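
The idea of implicit enumeration can be sketched in a few lines. This is a simplified depth-first illustration, not Balas' additive algorithm as published in Ref. [10.9]: the function name `implicit_enumeration` is hypothetical, the constraints are assumed to be written as $AX \le B$ with nonnegative costs, and variables are fixed in index order rather than by Balas' selection rules. A partial assignment is discarded without enumerating its completions when it cannot beat the incumbent or cannot satisfy some constraint.

```python
from typing import List, Optional, Sequence, Tuple


def implicit_enumeration(
    c: Sequence[int], A: Sequence[Sequence[int]], b: Sequence[int]
) -> Optional[Tuple[int, List[int]]]:
    """Minimize c.x subject to A x <= b, x_j in {0, 1}, assuming all c_j >= 0."""
    n = len(c)
    best: Optional[Tuple[int, List[int]]] = None

    def row_values(prefix: List[int]) -> List[Tuple[int, int]]:
        # For each constraint: left-hand side with the free variables at 0,
        # and the lowest left-hand side any completion could reach.
        k = len(prefix)
        out = []
        for row in A:
            fixed = sum(row[j] * prefix[j] for j in range(k))
            lowest = fixed + sum(min(0, row[j]) for j in range(k, n))
            out.append((fixed, lowest))
        return out

    def visit(prefix: List[int], cost: int) -> None:
        nonlocal best
        if best is not None and cost >= best[0]:
            return  # costs are nonnegative, so no completion can improve
        values = row_values(prefix)
        if any(lowest > rhs for (_, lowest), rhs in zip(values, b)):
            return  # some constraint cannot be satisfied by any completion
        if all(fixed <= rhs for (fixed, _), rhs in zip(values, b)):
            # Setting the free variables to 0 is feasible and cheapest.
            best = (cost, prefix + [0] * (n - len(prefix)))
            return
        k = len(prefix)
        for value in (0, 1):
            visit(prefix + [value], cost + c[k] * value)

    visit([], 0)
    return best
```

For example, with costs (3, 2, 5) and the constraints $-x_1 - x_2 - x_3 \le -2$ and $x_1 + x_2 \le 1$, the sketch returns the cost 7 at $X = (0, 1, 1)$ after examining only part of the $2^3$ combinations.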


Standard Form of the Problem. To describe the algorithm, consider the following 
form of the LP problem with zero-one variables: 

Find $X = \{x_1, x_2, \ldots, x_n\}^T$ such that $f(X) = C^T X \to$ minimum 

subject to 

$$AX + Y = B$$
$$x_i = 0 \text{ or } 1 \tag{10.28}$$
$$Y \ge 0$$

where 

$$C = \begin{Bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{Bmatrix} \ge 0, \quad Y = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{Bmatrix}, \quad B = \begin{Bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{Bmatrix}$$

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

where Y is the vector of slack variables and $c_i$ and $a_{ij}$ need not be integers. 

Initial Solution. An initial solution for the problem stated in Eqs. (10.28) can be 
taken as 

$$f_0 = 0$$
$$x_i = 0, \quad i = 1, 2, \ldots, n \tag{10.29}$$
$$Y^{(0)} = B$$

If $B \ge 0$, this solution will be feasible and optimal since $C \ge 0$ in Eqs. (10.28). In this 
case there is nothing more to be done as the starting solution itself happens to be 
optimal. On the other hand, if some of the components $b_j$ are negative, the solution given 
by Eqs. (10.29) will be optimal (since $C \ge 0$) but infeasible. Thus the method starts 
with an optimal (actually better than optimal) and infeasible solution. The algorithm 
forces this solution toward feasibility while keeping it optimal all the time. This is the 
reason why Balas called his method the pseudo dual simplex method. The word pseudo 
has been used since the method is similar to the dual simplex method only as far as 
the starting solution is concerned and the subsequent procedure has no similarity at all 
with the dual simplex method. The details can be found in Ref. [10.9]. 


10.5 INTEGER POLYNOMIAL PROGRAMMING 


Watters [10.2] has developed a procedure for converting integer polynomial program- 
ming problems to zero-one LP problems. The resulting zero-one LP problem can 
be solved conveniently by the Balas method discussed in Section 10.4. Consider the 
optimization problem: 


Find $X = \{x_1, x_2, \ldots, x_n\}^T$ which minimizes $f(X)$ 

subject to the constraints 

$$g_j(X) \le 0, \quad j = 1, 2, \ldots, m$$
$$x_i = \text{integer}, \quad i = 1, 2, \ldots, n \tag{10.30}$$

where $f$ and $g_j$, $j = 1, 2, \ldots, m$, are polynomials in the variables $x_1, x_2, \ldots, x_n$. A 
typical term in the polynomials can be represented as 

$$c_k \prod_{l=1}^{n_k} (x_l)^{a_{kl}} \tag{10.31}$$

where $c_k$ is a constant, $a_{kl}$ a nonnegative constant exponent, and $n_k$ the number 
of variables appearing in the kth term. We shall convert the integer polynomial 
programming problem stated in Eq. (10.30) into an equivalent zero-one LP problem 
in two stages. In the first stage we see how an integer variable, $x_i$, can be represented 
by an equivalent system of zero-one (binary) variables. We consider the conversion 
of a zero-one polynomial programming problem into a zero-one LP problem in the 
second stage. 


10.5.1 Representation of an Integer Variable by an Equivalent System 
of Binary Variables 

Let $x_i$ be any integer variable whose upper bound is given by $u_i$, so that 

$$x_i \le u_i < \infty \tag{10.32}$$

We assume that the value of the upper bound $u_i$ can be determined from the constraints 
of the given problem. 

We know that in the decimal number system, an integer p is represented as 

$$p = p_0 + 10^1 p_1 + 10^2 p_2 + \cdots, \quad 0 \le p_i \le (10 - 1 = 9) \quad \text{for } i = 0, 1, 2, \ldots$$

and written as $p = \cdots p_2 p_1 p_0$ by neglecting the zeros to the left. For example, we write 
the number p = 008076 as 8076 to represent $p = 6 + (10^1)7 + (10^2)(0) + (10^3)8 + 
(10^4)0 + (10^5)0$. In a similar manner, the integer p can also be represented in the binary 
number system as 

$$p = q_0 + 2^1 q_1 + 2^2 q_2 + 2^3 q_3 + \cdots$$

where $0 \le q_i \le (2 - 1 = 1)$ for $i = 0, 1, 2, \ldots$ 

In general, if $y_i^{(0)}, y_i^{(1)}, y_i^{(2)}, \ldots$ denote binary numbers (which can take a value of 
0 or 1), the variable $x_i$ can be expressed as 

$$x_i = \sum_{k=0}^{N_i} 2^k y_i^{(k)} \tag{10.33}$$

where $N_i$ is the smallest integer such that 

$$\frac{u_i + 1}{2} \le 2^{N_i} \tag{10.34}$$

Thus the value of $N_i$ can be selected for any integer variable $x_i$ once its upper bound 
$u_i$ is known. For example, for the number 97, we can take $u_i = 97$ and hence the 
relation 

$$\frac{u_i + 1}{2} = \frac{98}{2} = 49 \le 2^{N_i}$$

is satisfied for $N_i \ge 6$. Hence by taking $N_i = 6$, we can represent $u_i$ as 

$$97 = q_0 + 2^1 q_1 + 2^2 q_2 + 2^3 q_3 + 2^4 q_4 + 2^5 q_5 + 2^6 q_6$$

where $q_0 = 1$, $q_1 = q_2 = q_3 = q_4 = 0$, and $q_5 = q_6 = 1$. A systematic method of 
finding the values of $q_0, q_1, q_2, \ldots$ is given below. 

Method of Finding $q_0, q_1, q_2, \ldots$. Let M be the given positive integer. To find its 
binary representation $q_n q_{n-1} \ldots q_1 q_0$, we compute the following recursively: 

$$b_0 = M$$
$$b_1 = \frac{b_0 - q_0}{2}$$
$$\vdots \tag{10.35}$$
$$b_k = \frac{b_{k-1} - q_{k-1}}{2}$$

where $q_k = 1$ if $b_k$ is odd and $q_k = 0$ if $b_k$ is even. The procedure terminates when 
$b_k = 0$. 

Equation (10.33) guarantees that $x_i$ can take any feasible integer value less than 
or equal to $u_i$. The use of Eq. (10.33) in the problem stated in Eq. (10.30) will convert 
the integer programming problem into a binary one automatically. The only difference 
is that the binary problem will have $N_1 + N_2 + \cdots + N_n$ zero-one variables instead 
of the n original integer variables. 
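
The recursion of Eq. (10.35) and the choice of $N_i$ in Eq. (10.34) translate directly into code. The sketch below is illustrative; the function names `binary_digits` and `highest_power` are hypothetical, and only integer arithmetic is used.

```python
def binary_digits(M):
    """Digits q_0, q_1, ... of a positive integer M via the recursion (10.35)."""
    digits = []
    b = M                      # b_0 = M
    while b != 0:              # the procedure terminates when b_k = 0
        q = b % 2              # q_k = 1 if b_k is odd, 0 if b_k is even
        digits.append(q)
        b = (b - q) // 2       # b_{k+1} = (b_k - q_k) / 2
    return digits


def highest_power(u):
    """Smallest N such that (u + 1) / 2 <= 2**N, as in Eq. (10.34)."""
    N = 0
    while (u + 1) > 2 ** (N + 1):   # equivalent to (u + 1) / 2 > 2**N
        N += 1
    return N
```

For the number 97 used above, `highest_power(97)` gives 6 and `binary_digits(97)` gives the digits $q_0, \ldots, q_6$ as 1, 0, 0, 0, 0, 1, 1, in agreement with the values quoted in the text.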


10.5.2 Conversion of a Zero-One Polynomial Programming Problem 
into a Zero-One LP Problem 

The conversion of a polynomial programming problem into an LP problem is based on 
the fact that 

$$x_i^{a_{ki}} = x_i \tag{10.36}$$

if $x_i$ is a binary variable (0 or 1) and $a_{ki}$ is a positive exponent. If $a_{ki} = 0$, then 
obviously the variable $x_i$ will not be present in the kth term. The use of Eq. (10.36) 
permits us to write the kth term of the polynomial, Eq. (10.31), as 

$$c_k \prod_{l=1}^{n_k} (x_l)^{a_{kl}} = c_k \prod_{l=1}^{n_k} x_l = c_k (x_1 x_2 \cdots x_{n_k}) \tag{10.37}$$

Since each of the variables $x_1, x_2, \ldots$ can take a value of either 0 or 1, the product 
$(x_1 x_2 \cdots x_{n_k})$ also will take a value of 0 or 1. Hence by defining a binary variable 
$y_k$ as 

$$y_k = x_1 x_2 \cdots x_{n_k} = \prod_{l=1}^{n_k} x_l \tag{10.38}$$

the kth term of the polynomial simply becomes $c_k y_k$. However, we need to add the 
following constraints to ensure that $y_k = 1$ when all $x_i = 1$ and zero otherwise: 

$$y_k \ge \sum_{i=1}^{n_k} x_i - (n_k - 1) \tag{10.39}$$

$$y_k \le \frac{1}{n_k} \sum_{i=1}^{n_k} x_i \tag{10.40}$$

It can be seen that if all $x_i = 1$, $\sum_{i=1}^{n_k} x_i = n_k$, and Eqs. (10.39) and (10.40) yield 

$$y_k \ge 1 \tag{10.41}$$

$$y_k \le 1 \tag{10.42}$$

which can be satisfied only if $y_k = 1$. If at least one $x_i = 0$, we have $\sum_{i=1}^{n_k} x_i < n_k$, 
and Eqs. (10.39) and (10.40) give 

$$y_k \ge -(n_k - 1) \tag{10.43}$$

$$y_k < 1 \tag{10.44}$$

Since $n_k$ is a positive integer, the only way to satisfy Eqs. (10.43) and (10.44) under 
all circumstances is to have $y_k = 0$. 

This procedure of converting an integer polynomial programming problem into an 
equivalent zero-one LP problem can always be applied, at least in theory. 
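
The claim that constraints (10.39) and (10.40) force $y_k$ to equal the product of the binary variables can be checked exhaustively for small $n_k$. The sketch below is illustrative only; `feasible_y` and `linearization_is_exact` are hypothetical names, and Eq. (10.40) is multiplied through by $n_k$ so that the check uses integer arithmetic.

```python
from itertools import product


def feasible_y(x):
    """Binary values of y_k allowed by Eqs. (10.39) and (10.40) for a binary tuple x."""
    n_k = len(x)
    total = sum(x)
    # Eq. (10.39): y >= total - (n_k - 1);  Eq. (10.40): n_k * y <= total
    return [y for y in (0, 1) if y >= total - (n_k - 1) and n_k * y <= total]


def linearization_is_exact(n_k):
    """True if, for every binary x, the only feasible y_k is the product of the x_i."""
    for x in product((0, 1), repeat=n_k):
        product_of_x = int(all(x))
        if feasible_y(x) != [product_of_x]:
            return False
    return True
```

Running `linearization_is_exact(n_k)` for a few small values of `n_k` enumerates all $2^{n_k}$ binary vectors and confirms that a single value of $y_k$ survives in each case.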


10.6 BRANCH-AND-BOUND METHOD 

The branch-and-bound method is very effective in solving mixed-integer linear and 
nonlinear programming problems. The method was originally developed by Land and 
Doig [10.8] to solve integer linear programming problems and was later modified 
by Dakin [10.23]. Subsequently, the method has been extended to solve nonlinear 
mixed-integer programming problems. To see the basic solution procedure, consider 
the following nonlinear mixed-integer programming problem: 


$$\text{Minimize } f(X) \tag{10.45}$$

subject to 

$$g_j(X) \ge 0, \quad j = 1, 2, \ldots, m \tag{10.46}$$

$$h_k(X) = 0, \quad k = 1, 2, \ldots, p \tag{10.47}$$

$$x_j = \text{integer}, \quad j = 1, 2, \ldots, n_0 \ (n_0 \le n) \tag{10.48}$$

where $X = \{x_1, x_2, \ldots, x_n\}^T$. Note that in the design vector X, the first $n_0$ variables 
are identified as the integer variables. If $n_0 = n$, the problem becomes an all-integer 
programming problem. A design vector X is called a continuous feasible solution if 
X satisfies constraints (10.46) and (10.47). A design vector X that satisfies all the 
constraints, Eqs. (10.46) to (10.48), is called an integer feasible solution. 

The simplest method of solving an integer optimization problem involves enumer- 
ating all integer points, discarding infeasible ones, evaluating the objective function 
at all integer feasible points, and identifying the point that has the best objective 
function value. Although such an exhaustive search in the solution space is simple 
to implement, it will be computationally expensive even for moderate-size problems. 
The branch-and-bound method can be considered as a refined enumeration method in 
which most of the nonpromising integer points are discarded without testing them. Also 
note that the process of complete enumeration can be used only if the problem is an 
all-integer programming problem. For mixed-integer problems in which one or more 
variables may assume continuous values, the process of complete enumeration cannot 
be used. 

In the branch-and-bound method, the integer problem is not directly solved. Rather, 
the method first solves a continuous problem obtained by relaxing the integer restric- 
tions on the variables. If the solution of the continuous problem happens to be an 
integer solution, it represents the optimum solution of the integer problem. Otherwise, 
at least one of the integer variables, say $x_i$, must assume a nonintegral value. If $x_i$ is 
not an integer, we can always find an integer $[x_i]$ such that 

$$[x_i] < x_i < [x_i] + 1 \tag{10.49}$$

Then two subproblems are formulated, one with the additional upper bound 
constraint 

$$x_i \le [x_i] \tag{10.50}$$

and another with the lower bound constraint 

$$x_i \ge [x_i] + 1 \tag{10.51}$$

The process of finding these subproblems is called branching. 

The branching process eliminates some portion of the continuous space that is not 
feasible for the integer problem, while ensuring that none of the integer feasible 
solutions are eliminated. Each of these two subproblems is solved again as a continuous 
problem. It can be seen that the solution of a continuous problem forms a node and 
from each node two branches may originate. 

The process of branching and solving a sequence of continuous problems discussed 
above is continued until an integer feasible solution is found for one of the two 
continuous problems. When such a feasible integer solution is found, the corresponding 
value of the objective function becomes an upper bound on the minimum value of the 
objective function. At this stage we can eliminate from further consideration all the 
continuous solutions (nodes) whose objective function values are larger than the upper 
bound. The nodes that are eliminated are said to have been fathomed because it is 
not possible to find a better integer solution from these nodes (solution spaces) than 
what we have now. The value of the upper bound on the objective function is updated 
whenever a better bound is obtained. 

It can be seen that a node can be fathomed if any of the following conditions 
are true: 

1. The continuous solution is an integer feasible solution. 

2. The problem does not have a continuous feasible solution. 

3. The optimal value of the continuous problem is larger than the current upper 
bound. 

The algorithm continues to select a node for further branching until all the nodes have 
been fathomed. At that stage, the particular fathomed node that has the integer feasible 
solution with the lowest value of the objective function gives the optimum solution of 
the original nonlinear integer programming problem. 
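
The branching and fathoming rules can be sketched for a small two-variable LP. The code below is an illustration under stated assumptions, not the Land and Doig or Dakin procedure as published: `solve_lp_2d` and `branch_and_bound` are hypothetical helper names, each continuous problem is solved by enumerating constraint intersections with exact fractions (which is practical only for two variables and a bounded feasible region), and the problem is posed as a maximization, so the incumbent value acts as a lower bound instead of an upper bound. The data in the usage note are the inequality constraints underlying Example 10.1.

```python
from fractions import Fraction
from itertools import combinations
from math import floor


def solve_lp_2d(c, cons):
    """Maximize c.x over {x : a.x <= b for (a, b) in cons} by vertex enumeration.

    Returns (value, point), or None if no feasible vertex exists.
    Assumes two variables and a bounded feasible region.
    """
    best = None
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue  # parallel constraints do not define a vertex
        x = Fraction(b1 * a2[1] - a1[1] * b2, det)   # Cramer's rule
        y = Fraction(a1[0] * b2 - b1 * a2[0], det)
        if all(a[0] * x + a[1] * y <= b for a, b in cons):
            value = c[0] * x + c[1] * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best


def branch_and_bound(c, cons):
    """Return (value, point) of the best all-integer solution, or None."""
    incumbent = None
    stack = [list(cons)]
    while stack:
        node = stack.pop()
        relaxed = solve_lp_2d(c, node)
        if relaxed is None:
            continue  # fathomed: no continuous feasible solution
        value, point = relaxed
        if incumbent is not None and value <= incumbent[0]:
            continue  # fathomed: cannot improve on the incumbent
        fractional = [i for i in range(2) if point[i].denominator != 1]
        if not fractional:
            incumbent = (value, point)  # fathomed: integer feasible solution
            continue
        i = fractional[0]
        low = floor(point[i])
        unit = (1, 0) if i == 0 else (0, 1)
        minus_unit = (-1, 0) if i == 0 else (0, -1)
        stack.append(node + [(unit, low)])                 # x_i <= [x_i]
        stack.append(node + [(minus_unit, -(low + 1))])    # x_i >= [x_i] + 1
    return incumbent
```

With `cons = [((3, -1), 12), ((3, 11), 66), ((-1, 0), 0), ((0, -1), 0)]` and `c = (3, 4)`, the sketch returns the value 31 at the point (5, 4), which matches the all-integer solution found by the cutting plane method in Example 10.1.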

Example 10.3 Solve the following LP problem using the branch-and-bound method: 

Maximize $f = 3x_1 + 4x_2$ 

subject to 

$$7x_1 + 11x_2 \le 88, \quad 3x_1 - x_2 \le 12, \quad x_1 \ge 0, \quad x_2 \ge 0 \tag{E_1}$$

$$x_i = \text{integer}, \quad i = 1, 2 \tag{E_2}$$

SOLUTION The various steps of the procedure are illustrated using graphical method. 


Step 1: First the problem is solved as a continuous variable problem [without Eq. (E2)] 
to obtain: 

Problem (E1): Fig. 10.2; ($x_1^* = 5.5$, $x_2^* = 4.5$, $f^* = 34.5$) 

Step 2: The branching process, with integer bounds on $x_1$, yields the problems: 

Maximize $f = 3x_1 + 4x_2$ 

subject to 

$$7x_1 + 11x_2 \le 88, \quad 3x_1 - x_2 \le 12, \quad x_1 \le 5, \quad x_2 \ge 0 \tag{E_3}$$

and 

Maximize $f = 3x_1 + 4x_2$ 

subject to 

$$7x_1 + 11x_2 \le 88, \quad 3x_1 - x_2 \le 12, \quad x_1 \ge 6, \quad x_2 \ge 0 \tag{E_4}$$

The solutions of problems (E3) and (E4) are given by 

Problem (E3): Fig. 10.4; ($x_1^* = 5$, $x_2^* = 4.8182$, $f^* = 34.2727$) 
Problem (E4): Fig. 10.5; no feasible solution exists. 


Step 3: The next branching process, with integer bounds on $x_2$, leads to the following 
problems: 

Maximize $f = 3x_1 + 4x_2$ 

subject to 

$$7x_1 + 11x_2 \le 88, \quad 3x_1 - x_2 \le 12, \quad x_1 \le 5, \quad x_2 \le 4 \tag{E_5}$$

and 

Maximize $f = 3x_1 + 4x_2$ 

subject to 

$$7x_1 + 11x_2 \le 88, \quad 3x_1 - x_2 \le 12, \quad x_1 \le 5, \quad x_2 \ge 5 \tag{E_6}$$

Figure 10.5 Graphical solution of problem (E4). 

The solutions of problems (E5) and (E6) are given by 

Problem (E5): Fig. 10.6; ($x_1^* = 5$, $x_2^* = 4$, $f^* = 31$) 

Problem (E6): Fig. 10.7; ($x_1^* = 0$, $x_2^* = 8$, $f^* = 32$) 

Since both the variables assumed integer values, the optimum solution of the 
integer LP problem, Eqs. (E1) and (E2), is given by ($x_1^* = 0$, $x_2^* = 8$, $f^* = 32$). 

Example 10.4 Find the solution of the welded beam problem of Section 7.22.3 by 
treating it as a mixed-integer nonlinear programming problem by requiring $x_3$ and $x_4$ 
to take integer values. 

SOLUTION The solution of this problem using the branch-and-bound method was 
reported in Ref. [10.25]. The optimum solution of the continuous variable nonlinear 
programming problem is given by 

$$X^* = \{0.24, 6.22, 8.29, 0.24\}^T, \quad f^* = 2.38$$

Next, the branching problems, with integer bounds on $x_3$, are solved and the 
procedure is continued until the desired optimum solution is found. The results are shown 
in Fig. 10.8. 


10.7 SEQUENTIAL LINEAR DISCRETE PROGRAMMING 

Let the nonlinear programming problem with discrete variables be stated as follows: 


$$\text{Minimize } f(X) \tag{10.52}$$

subject to 

$$g_j(X) \le 0, \quad j = 1, 2, \ldots, m \tag{10.53}$$

$$h_k(X) = 0, \quad k = 1, 2, \ldots, p \tag{10.54}$$

$$x_i \in \{d_{i1}, d_{i2}, \ldots, d_{iq}\}, \quad i = 1, 2, \ldots, n_0 \tag{10.55}$$

$$x_i^{(l)} \le x_i \le x_i^{(u)}, \quad i = n_0 + 1, n_0 + 2, \ldots, n \tag{10.56}$$

where the first $n_0$ design variables are assumed to be discrete, $d_{ij}$ is the jth discrete 
value for the variable i, and $X = \{x_1, x_2, \ldots, x_n\}^T$. It is possible to find the solution 
of this problem by solving a series of mixed-integer linear programming problems. 

The nonlinear expressions in Eqs. (10.52) to (10.54) are linearized about a point 
X° using a first-order Taylor’s series expansion and the problem is stated as 



$$\text{Minimize } f(X) \approx f(X^0) + \nabla f(X^0)^T \delta X \tag{10.57}$$

subject to 

$$g_j(X) \approx g_j(X^0) + \nabla g_j(X^0)^T \delta X \le 0, \quad j = 1, 2, \ldots, m \tag{10.58}$$

$$h_k(X) \approx h_k(X^0) + \nabla h_k(X^0)^T \delta X = 0, \quad k = 1, 2, \ldots, p \tag{10.59}$$

$$x_i^0 + \delta x_i \in \{d_{i1}, d_{i2}, \ldots, d_{iq}\}, \quad i = 1, 2, \ldots, n_0 \tag{10.60}$$

$$x_i^{(l)} \le x_i^0 + \delta x_i \le x_i^{(u)}, \quad i = n_0 + 1, n_0 + 2, \ldots, n \tag{10.61}$$

$$\delta X = X - X^0 \tag{10.62}$$

Figure 10.8 Solution of the welded beam problem using branch-and-bound method. [10.25] 

The problem stated in Eqs. (10.57) to (10.62) cannot be solved using mixed-integer 
linear programming techniques since some of the design variables are discrete and 
noninteger. The discrete variables are redefined as [10.26] 


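$$x_i = y_{i1} d_{i1} + y_{i2} d_{i2} + \cdots + y_{iq} d_{iq} = \sum_{j=1}^{q} y_{ij} d_{ij}, \quad i = 1, 2, \ldots, n_0 \tag{10.63}$$

with 

$$y_{i1} + y_{i2} + \cdots + y_{iq} = \sum_{j=1}^{q} y_{ij} = 1 \tag{10.64}$$

$$y_{ij} = 0 \text{ or } 1, \quad i = 1, 2, \ldots, n_0, \quad j = 1, 2, \ldots, q \tag{10.65}$$

Using Eqs. (10.63) to (10.65) in Eqs. (10.57) to (10.62), we obtain 

$$\text{Minimize } f(X) \approx f(X^0) + \sum_{i=1}^{n_0} \frac{\partial f}{\partial x_i} \left( \sum_{j=1}^{q} y_{ij} d_{ij} - x_i^0 \right) + \sum_{i=n_0+1}^{n} \frac{\partial f}{\partial x_i} (x_i - x_i^0) \tag{10.66}$$

subject to 

$$g_j(X) \approx g_j(X^0) + \sum_{i=1}^{n_0} \frac{\partial g_j}{\partial x_i} \left( \sum_{l=1}^{q} y_{il} d_{il} - x_i^0 \right) + \sum_{i=n_0+1}^{n} \frac{\partial g_j}{\partial x_i} (x_i - x_i^0) \le 0, \quad j = 1, 2, \ldots, m \tag{10.67}$$

$$h_k(X) \approx h_k(X^0) + \sum_{i=1}^{n_0} \frac{\partial h_k}{\partial x_i} \left( \sum_{l=1}^{q} y_{il} d_{il} - x_i^0 \right) + \sum_{i=n_0+1}^{n} \frac{\partial h_k}{\partial x_i} (x_i - x_i^0) = 0, \quad k = 1, 2, \ldots, p \tag{10.68}$$

$$\sum_{j=1}^{q} y_{ij} = 1, \quad i = 1, 2, \ldots, n_0 \tag{10.69}$$

$$y_{ij} = 0 \text{ or } 1, \quad i = 1, 2, \ldots, n_0, \quad j = 1, 2, \ldots, q \tag{10.70}$$

$$x_i^{(l)} \le x_i^0 + \delta x_i \le x_i^{(u)}, \quad i = n_0 + 1, n_0 + 2, \ldots, n \tag{10.71}$$

where all the partial derivatives are evaluated at $X^0$. 
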

The problem stated in Eqs. (10.66) to (10.71) can now be solved as a mixed-integer LP problem by treating both $y_{ij}$ ($i = 1, 2, \ldots, n_0$, $j = 1, 2, \ldots, q$) and $x_i$ ($i = n_0 + 1, n_0 + 2, \ldots, n$) as unknowns.

In practical implementation, the initial linearization point X° is to be selected 
carefully. In many cases the solution of the discrete problem is expected to lie in 
the vicinity of the continuous optimum. Hence the original problem can be solved as 
a continuous nonlinear programming problem (by ignoring the discrete nature of the 
variables) using any of the standard nonlinear programming techniques. If the resulting 
continuous optimum solution happens to be a feasible discrete solution, it can be used as 
X°. Otherwise, the values of $x_i$ from the continuous optimum solution are rounded (in
a direction away from constraint violation) to obtain an initial feasible discrete solution 
X°. Once the first linearized discrete problem is solved, the subsequent linearizations 
can be made using the result of the previous optimization problem. 

Example 10.5 [10.26] 

Minimize $f(\mathbf{X}) = 2x_1^2 + 3x_2^2$

subject to

$$g(\mathbf{X}) = \frac{1}{x_1} + \frac{1}{x_2} - 4 \le 0$$

$$x_1 \in \{0.3, 0.7, 0.8, 1.2, 1.5, 1.8\}$$

$$x_2 \in \{0.4, 0.8, 1.1, 1.4, 1.6\}$$


SOLUTION In this example, the set of discrete values of each variable is truncated by allowing only three values (its current value, the adjacent higher value, and the adjacent lower value) for simplifying the computations. Using $\mathbf{X}^0 = \{1.2, 1.1\}^T$, we have

$$f(\mathbf{X}^0) = 6.51, \quad g(\mathbf{X}^0) = -2.26$$

$$\nabla f(\mathbf{X}^0) = \begin{Bmatrix} 4x_1 \\ 6x_2 \end{Bmatrix}_{\mathbf{X}^0} = \begin{Bmatrix} 4.8 \\ 6.6 \end{Bmatrix}, \quad \nabla g(\mathbf{X}^0) = \begin{Bmatrix} -1/x_1^2 \\ -1/x_2^2 \end{Bmatrix}_{\mathbf{X}^0} = \begin{Bmatrix} -0.69 \\ -0.83 \end{Bmatrix}$$

Now

$$x_1 = y_{11}(0.8) + y_{12}(1.2) + y_{13}(1.5)$$

$$x_2 = y_{21}(0.8) + y_{22}(1.1) + y_{23}(1.4)$$

$$\delta x_1 = y_{11}(0.8 - 1.2) + y_{12}(1.2 - 1.2) + y_{13}(1.5 - 1.2)$$

$$\delta x_2 = y_{21}(0.8 - 1.1) + y_{22}(1.1 - 1.1) + y_{23}(1.4 - 1.1)$$

$$f \approx 6.51 + \{4.8 \;\; 6.6\} \begin{Bmatrix} -0.4 y_{11} + 0.3 y_{13} \\ -0.3 y_{21} + 0.3 y_{23} \end{Bmatrix}$$

$$g \approx -2.26 + \{-0.69 \;\; -0.83\} \begin{Bmatrix} -0.4 y_{11} + 0.3 y_{13} \\ -0.3 y_{21} + 0.3 y_{23} \end{Bmatrix}$$

Thus the first approximate problem becomes (in terms of the unknowns $y_{11}$, $y_{12}$, $y_{13}$, $y_{21}$, $y_{22}$, and $y_{23}$):

Minimize $f = 6.51 - 1.92 y_{11} + 1.44 y_{13} - 1.98 y_{21} + 1.98 y_{23}$

subject to

$$-2.26 + 0.28 y_{11} - 0.21 y_{13} + 0.25 y_{21} - 0.25 y_{23} \le 0$$

$$y_{11} + y_{12} + y_{13} = 1$$

$$y_{21} + y_{22} + y_{23} = 1$$

$$y_{ij} = 0 \text{ or } 1, \quad i = 1, 2, \quad j = 1, 2, 3$$

In this problem, there are only nine possible solutions and hence they can all be 
enumerated and the optimum solution can be found as 


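$$y_{11} = 1, \; y_{12} = 0, \; y_{13} = 0, \; y_{21} = 1, \; y_{22} = 0, \; y_{23} = 0$$

Thus the solution of the first approximate problem, in terms of original variables, is given by

$$x_1 = 0.8, \quad x_2 = 0.8, \quad f(\mathbf{X}) = 2.61, \quad \text{and} \quad g(\mathbf{X}) = -1.5$$

Note that 2.61 is the value of the linearized objective; the original objective $2x_1^2 + 3x_2^2$ evaluates to 3.20 at this point.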

This point can be used to generate a second approximate problem and the process can 
be repeated until the final optimum solution is found. 
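The enumeration of the nine candidate solutions in Example 10.5 can be sketched in a few lines of Python. This is only an illustrative check, not the procedure of [10.26]: it assumes the constraint is $g = 1/x_1 + 1/x_2 - 4 \le 0$ (which reproduces the values $g(\mathbf{X}^0) = -2.26$ and $\nabla g = \{-0.69, -0.83\}$ quoted above), and the function names are hypothetical. Choosing one discrete value per variable is equivalent to setting exactly one $y_{ij}$ to 1 in each row.

```python
from itertools import product

# Linearization point and the truncated discrete sets of Example 10.5.
x0 = (1.2, 1.1)
candidates = [(0.8, 1.2, 1.5), (0.8, 1.1, 1.4)]


def f(x1, x2):
    return 2.0 * x1**2 + 3.0 * x2**2


def g(x1, x2):
    # Assumed form of the constraint: 1/x1 + 1/x2 - 4 <= 0.
    return 1.0 / x1 + 1.0 / x2 - 4.0


# Gradients evaluated at the linearization point.
grad_f = (4.0 * x0[0], 6.0 * x0[1])
grad_g = (-1.0 / x0[0] ** 2, -1.0 / x0[1] ** 2)


def linearized(x1, x2):
    """First-order Taylor approximations of f and g about x0."""
    d1, d2 = x1 - x0[0], x2 - x0[1]
    f_lin = f(*x0) + grad_f[0] * d1 + grad_f[1] * d2
    g_lin = g(*x0) + grad_g[0] * d1 + grad_g[1] * d2
    return f_lin, g_lin


def solve_first_subproblem():
    """Enumerate all 3 x 3 combinations and keep the best feasible one."""
    best = None
    for x1, x2 in product(*candidates):
        f_lin, g_lin = linearized(x1, x2)
        if g_lin <= 0.0 and (best is None or f_lin < best[0]):
            best = (f_lin, x1, x2)
    return best


best = solve_first_subproblem()
print("linearized optimum:", best)
print("true f and g there:", f(best[1], best[2]), g(best[1], best[2]))
```

The point returned would then serve as the next linearization point $\mathbf{X}^0$.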
10.8 GENERALIZED PENALTY FUNCTION METHOD

The solution of an integer nonlinear programming problem, based on the concept of 
penalty functions, was originally suggested by Gellatly and Marcal in 1967 [10.5]. 
This approach was later applied by Gisvold and Moe [10.4] and Shin et al. [10.24] 
to solve some design problems that have been formulated as nonlinear mixed-integer 
programming problems. The method can be considered as an extension of the interior 
penalty function approach considered in Section 7.13. To see the details of the approach, 
let the problem be stated as follows: 


Find $\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$ which minimizes $f(\mathbf{X})$

subject to the constraints

$$g_j(\mathbf{X}) \ge 0, \quad j = 1, 2, \ldots, m$$

$$\mathbf{X}_c \in S_c \quad \text{and} \quad \mathbf{X}_d \in S_d \tag{10.72}$$

where the vector of variables ($\mathbf{X}$) is composed of two vectors $\mathbf{X}_d$ and $\mathbf{X}_c$, with $\mathbf{X}_d$ representing the set of integer variables and $\mathbf{X}_c$ representing the set of continuous variables. Notice that $\mathbf{X}_c$ will not be there if all the variables are constrained to take only integer values and $\mathbf{X}_d$ will not be there if none of the variables is restricted to take only integer values. The sets $S_c$ and $S_d$ denote the feasible sets of continuous and integer variables, respectively. To extend the interior penalty function approach to solve the present problem, given by Eq. (10.72), we first define the following transformed problem:

Minimize $\phi_k(\mathbf{X}, r_k, s_k)$

where

$$\phi_k(\mathbf{X}, r_k, s_k) = f(\mathbf{X}) + r_k \sum_{j=1}^{m} G_j[g_j(\mathbf{X})] + s_k Q_k(\mathbf{X}_d) \tag{10.73}$$

In this equation, $r_k$ is a weighting factor (penalty parameter) and

$$r_k \sum_{j=1}^{m} G_j[g_j(\mathbf{X})]$$

is the contribution of the constraints to the $\phi_k$ function, which can be taken as

$$r_k \sum_{j=1}^{m} G_j[g_j(\mathbf{X})] = + r_k \sum_{j=1}^{m} \frac{1}{g_j(\mathbf{X})} \tag{10.74}$$

It can be noted that this term is positive for all $\mathbf{X}$ satisfying the relations $g_j(\mathbf{X}) > 0$ and tends to $+\infty$ if any one particular constraint tends to have a value of zero. This property ensures that once the minimization of the $\phi_k$ function is started from a feasible point, the point always remains in the feasible region. The term $s_k Q_k(\mathbf{X}_d)$ can be considered as a penalty term with $s_k$ playing the role of a weighting factor (penalty parameter). The function $Q_k(\mathbf{X}_d)$ is constructed so as to give a penalty whenever some of the variables in $\mathbf{X}_d$ take values other than integer values. Thus the function $Q_k(\mathbf{X}_d)$ has the property that

$$Q_k(\mathbf{X}_d) = \begin{cases} 0 & \text{if } \mathbf{X}_d \in S_d \\ \mu > 0 & \text{if } \mathbf{X}_d \notin S_d \end{cases} \tag{10.75}$$

We can take, for example,

$$Q_k(\mathbf{X}_d) = \sum_{x_i \in \mathbf{X}_d} \left\{ 4 \left( \frac{x_i - y_i}{z_i - y_i} \right) \left( 1 - \frac{x_i - y_i}{z_i - y_i} \right) \right\}^{\beta_k} \tag{10.76}$$

where $y_i \le x_i$, $z_i \ge x_i$, and $\beta_k \ge 1$ is a constant. Here $y_i$ and $z_i$ are the two neighboring integer values for the value $x_i$. The function $Q_k(\mathbf{X}_d)$ is a normalized, symmetric beta function integrand. The variation of each of the terms under the summation sign in Eq. (10.76) for different values of $\beta_k$ is shown in Fig. 10.9. The value of $\beta_k$ has to be greater than or equal to 1 if the function $Q_k$ is to be continuous in its first derivative over the discretization or integer points.
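The behavior of the integer penalty term can be explored numerically with a short Python sketch. It assumes the normalized beta-function-integrand form $\{4t(1-t)\}^{\beta}$ with $t = (x_i - y_i)/(z_i - y_i)$, and it takes the neighboring admissible values to be the integers just below and above $x_i$; the function name `integer_penalty` is hypothetical.

```python
import math


def integer_penalty(x_d, beta):
    """Sum of {4 t (1 - t)}**beta over the integer-restricted variables.

    t is the normalized position of x between its neighboring integers
    y = floor(x) and z = y + 1 (an assumption of this sketch).
    """
    total = 0.0
    for x in x_d:
        y = math.floor(x)          # neighboring integer below (or equal to) x
        z = y + 1                  # neighboring integer above x
        t = (x - y) / (z - y)      # 0 at an integer, 0.5 at the midpoint
        total += (4.0 * t * (1.0 - t)) ** beta
    return total


# The penalty vanishes at integer points and peaks midway between them.
for x in (2.0, 2.1, 2.3, 2.5, 2.7, 2.9):
    print(x, integer_penalty([x], beta=2.2))
```

Raising `beta` flattens each term near the integer points and narrows the peak at the midpoint, which is the shape control referred to in the text.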

The use of the penalty term defined by Eq. (10.76) makes it possible to change the shape of the $\phi_k$ function by changing $\beta_k$, while the amplitude can be controlled by the weighting factor $s_k$. The $\phi_k$ function given in Eq. (10.73) is now minimized for a sequence of values of $r_k$ and $s_k$ such that for $k \to \infty$, we obtain

$$\min \phi_k(\mathbf{X}, r_k, s_k) \to \min f(\mathbf{X})$$

$$g_j(\mathbf{X}) \ge 0, \quad j = 1, 2, \ldots, m \tag{10.77}$$

$$Q_k(\mathbf{X}_d) \to 0$$

In most of the practical problems, one can obtain a reasonably good solution by carrying out the minimization of $\phi_k$ even for 5 to 10 values of $k$. The method is illustrated in Fig. 10.10 in the case of a single-variable problem. It can be noticed from Fig. 10.10 that the shape of the $\phi$ function (also called the response function) depends strongly on the numerical values of $r_k$, $s_k$, and $\beta_k$.



Figure 10.9 Contour of typical term in Eq. (10.76) [10.4].

Figure 10.10 Solution of a single-variable integer problem by penalty function method. $x_1$, discrete variable; $x_1^j$, $j$th value of $x_1$ [10.4].


Choice of the Initial Values of $r_k$, $s_k$, and $\beta_k$. The numerical values of $r_k$, $s_k$, and $\beta_k$ have to be chosen carefully to achieve fast convergence. If these values are chosen such that they give the response surfaces of the $\phi$ function as shown in Fig. 10.10c, several local minima will be introduced and the risk of missing the global minimum point will be greater. Hence the initial value of $s_k$ (namely, $s_1$) is to be chosen sufficiently small to yield a unimodal response surface. This can be achieved by setting

$$s_k Q'_k \ll P'_k \tag{10.78}$$

where $Q'_k$ is an estimate of the maximum magnitude of the gradient to the $Q_k$ surface and $P'_k$ is a measure of the gradient of the function $P_k$ defined by

$$P_k = f(\mathbf{X}) + r_k \sum_{j=1}^{m} G_j[g_j(\mathbf{X})] \tag{10.79}$$

Gisvold and Moe [10.4] have taken the values of $Q'_k$ and $P'_k$ as

$$Q'_k = \tfrac{1}{2} \cdot 4^{\beta_k} \beta_k (\beta_k - 1)^{\beta_k - 1} (2\beta_k - 1)^{1/2 - \beta_k} \tag{10.80}$$

$$P'_k = \left( \frac{\nabla P_k^T \nabla P_k}{n} \right)^{1/2} \tag{10.81}$$

where

$$\nabla P_k = \begin{Bmatrix} \partial P_k / \partial x_1 \\ \partial P_k / \partial x_2 \\ \vdots \\ \partial P_k / \partial x_n \end{Bmatrix} \tag{10.82}$$

The initial value of $s_1$, according to the requirement of Eq. (10.78), is given by

$$s_1 = c_1 \frac{P'_1(\mathbf{X}_1, r_1)}{Q'_1(\mathbf{X}_1^{(d)}, \beta_1)} \tag{10.83}$$

where $\mathbf{X}_1$ is the initial starting point for the minimization of $\phi_1$, $\mathbf{X}_1^{(d)}$ the set of starting values of integer-restricted variables, and $c_1$ a constant whose value is generally taken in the range 0.001 to 0.1.

To choose the weighting factor $r_1$, the same considerations as discussed in Section 7.13 are to be taken into account. Accordingly, the value of $r_1$ is chosen as

$$r_1 = c_2 \frac{f(\mathbf{X}_1)}{+\sum_{j=1}^{m} 1/g_j(\mathbf{X}_1)} \tag{10.84}$$

with the value of $c_2$ ranging between 0.1 and 1.0. Finally, the parameter $\beta_1$ must be taken greater than 1 to maintain the continuity of the first derivative of the function $\phi_1$ over the discretization points. Although no systematic study has been conducted to find the effect of choosing different values for $\beta_1$, the value of $\beta_1 \simeq 2.2$ has been found to give satisfactory convergence in some of the design problems.

Once the initial values of $r_k$, $s_k$, and $\beta_k$ (for $k = 1$) are chosen, the subsequent values also have to be chosen carefully based on the numerical results obtained on similar formulations. The sequence of values $r_k$ is usually determined by using the relation

$$r_{k+1} = c_3 r_k, \quad k = 1, 2, \ldots \tag{10.85}$$

where $c_3 < 1$. Generally, the value of $c_3$ is taken in the range 0.05 to 0.5. To select the values of $s_k$, we first notice that the effect of the term $s_k Q_k(\mathbf{X}_d)$ is somewhat similar to that of an equality constraint. Hence the method used in finding the weighting factors for equality constraints can be used to find the factor $s_{k+1}$. For equality constraints, we use

$$\frac{s_{k+1}}{s_k} = \frac{r_k^{1/2}}{r_{k+1}^{1/2}} \tag{10.86}$$

From Eqs. (10.85) and (10.86), we can take

$$s_{k+1} = c_4 s_k \tag{10.87}$$

with $c_4$ approximately lying in the range $\sqrt{1/0.5}$ and $\sqrt{1/0.05}$ (i.e., 1.4 and 4.5). The values of $\beta_k$ can be selected according to the relation

$$\beta_{k+1} = c_5 \beta_k \tag{10.88}$$

with $c_5$ lying in the range 0.7 to 0.9.
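The update rules of Eqs. (10.85), (10.87), and (10.88) amount to three geometric sequences. A minimal Python sketch follows; the function name, default constants, and starting values are illustrative assumptions chosen from the ranges quoted above, with $c_4 = \sqrt{1/c_3}$ as implied by combining Eqs. (10.85) and (10.86).

```python
import math


def parameter_schedule(r1, s1, beta1, c3=0.1, c5=0.8, n_steps=5):
    """Return [(k, r_k, s_k, beta_k)] for k = 1 .. n_steps."""
    c4 = math.sqrt(1.0 / c3)   # s grows as r shrinks: s_{k+1}/s_k = sqrt(r_k/r_{k+1})
    rows = []
    r, s, beta = r1, s1, beta1
    for k in range(1, n_steps + 1):
        rows.append((k, r, s, beta))
        r, s, beta = c3 * r, c4 * s, c5 * beta
    return rows


for k, r, s, beta in parameter_schedule(r1=1.0, s1=0.01, beta1=2.2):
    print(f"k={k}  r={r:.5g}  s={s:.5g}  beta={beta:.4g}")
```

Because $c_5 < 1$, the $\beta_k$ values shrink at every step and eventually fall below 1, the lower limit stated earlier for continuity of the first derivative; the sketch makes no attempt to guard against that.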





A general convergence proof of the penalty function method, including the integer 
programming problems, was given by Fiacco [10.6]. Hence the present method is
guaranteed to converge at least to a local minimum if the recovery procedure is applied 
the required number of times. 


Example 10.6 [10.24] Find the minimum weight design of the three-bar truss shown in Fig. 10.11 with constraints on the stresses induced in the members. Treat the areas of cross section of the members as discrete variables with permissible values of the parameter $A_i \sigma_{\max}/P$ given by 0.1, 0.2, 0.3, 0.5, 0.8, 1.0, and 1.2.

SOLUTION By defining the nondimensional quantities $f$ and $x_i$ as

$$f = \frac{W \sigma_{\max}}{P \rho l}, \quad x_i = \frac{A_i \sigma_{\max}}{P}, \quad i = 1, 2, 3$$

where $W$ is the weight of the truss, $\sigma_{\max}$ the permissible (absolute) value of stress, $P$ the load, $\rho$ the density, $l$ the depth, and $A_i$ the area of cross section of member $i$ ($i = 1, 2, 3$), the discrete optimization problem can be stated as follows:

Minimize $f = 2x_1 + x_2 + \sqrt{2}\, x_3$

subject to

$$g_1(\mathbf{X}) = 1 - \frac{\sqrt{3}\, x_2 + 1.932 x_3}{1.5 x_1 x_2 + \sqrt{2}\, x_2 x_3 + 1.319 x_1 x_3} \ge 0$$

$$g_2(\mathbf{X}) = 1 - \frac{0.634 x_1 + 2.828 x_3}{1.5 x_1 x_2 + \sqrt{2}\, x_2 x_3 + 1.319 x_1 x_3} \ge 0$$

$$g_3(\mathbf{X}) = 1 - \frac{0.5 x_1 - 2 x_2}{1.5 x_1 x_2 + \sqrt{2}\, x_2 x_3 + 1.319 x_1 x_3} \ge 0$$

$$g_4(\mathbf{X}) = 1 + \frac{0.5 x_1 - 2 x_2}{1.5 x_1 x_2 + \sqrt{2}\, x_2 x_3 + 1.319 x_1 x_3} \ge 0$$

$$x_i \in \{0.1, 0.2, 0.3, 0.5, 0.8, 1.0, 1.2\}, \quad i = 1, 2, 3$$

The optimum solution of the continuous variable problem is given by $f^* = 2.7336$, $x_1^* = 1.1549$, $x_2^* = 0.4232$, and $x_3^* = 0.0004$. The optimum solution of the discrete variable problem is given by $f^* = 3.0414$, $x_1^* = 1.2$, $x_2^* = 0.5$, and $x_3^* = 0.1$.
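Since each variable in Example 10.6 can take only seven values, the discrete optimum can be cross-checked by enumerating all $7^3 = 343$ designs. The Python sketch below is such a check, not the penalty function method itself. It assumes the constraints have the form $g_{1,2} = 1 - (\cdot)/D \ge 0$ and $g_{3,4} = 1 \mp (0.5x_1 - 2x_2)/D \ge 0$ with the common denominator $D$ shown in the example; the function names are hypothetical.

```python
import math
from itertools import product

VALUES = (0.1, 0.2, 0.3, 0.5, 0.8, 1.0, 1.2)   # permissible x_i values


def weight(x1, x2, x3):
    """Nondimensional weight f = 2 x1 + x2 + sqrt(2) x3."""
    return 2.0 * x1 + x2 + math.sqrt(2.0) * x3


def constraints(x1, x2, x3):
    """Return (g1, g2, g3, g4); a design is feasible when all are >= 0."""
    d = 1.5 * x1 * x2 + math.sqrt(2.0) * x2 * x3 + 1.319 * x1 * x3
    g1 = 1.0 - (math.sqrt(3.0) * x2 + 1.932 * x3) / d
    g2 = 1.0 - (0.634 * x1 + 2.828 * x3) / d
    g3 = 1.0 - (0.5 * x1 - 2.0 * x2) / d
    g4 = 1.0 + (0.5 * x1 - 2.0 * x2) / d
    return (g1, g2, g3, g4)


def best_discrete_design():
    """Exhaustively search the 343 designs for the lightest feasible one."""
    feasible = [x for x in product(VALUES, repeat=3)
                if min(constraints(*x)) >= 0.0]
    return min(feasible, key=lambda x: weight(*x))


x_best = best_discrete_design()
print(x_best, round(weight(*x_best), 4))
```

Enumeration is practical here only because the design space is tiny; the count grows as $q^n$ for $n$ variables with $q$ values each.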


10.9 SOLUTION OF BINARY PROGRAMMING PROBLEMS USING 
MATLAB 

The MATLAB function bintprog can be used to solve a binary (or zero-one) pro- 
gramming problem. The following example illustrates the procedure. 

Example 10.7 Find the solution of the following binary programming problem using 
the MATLAB function bintprog: 

Minimize $f(\mathbf{X}) = -5x_1 - 5x_2 - 8x_3 + 2x_4 + 4x_5$

subject to

$$3x_1 - 6x_2 + 7x_3 - 9x_4 - 9x_5 \le -10, \quad x_1 + 2x_2 - x_4 - 3x_5 \le 0$$

$$x_i \text{ binary}; \quad i = 1, 2, 3, 4, 5$$


SOLUTION 

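Step 1: State the problem in the form required by the program bintprog:

Minimize $f(\mathbf{x}) = \mathbf{f}^T \mathbf{x}$ subject to $\mathbf{A}\mathbf{x} \le \mathbf{b}$ and $\mathbf{A}_{eq}\mathbf{x} = \mathbf{b}_{eq}$

Here

$$\mathbf{f}^T = \{-5 \;\; -5 \;\; -8 \;\; 2 \;\; 4\}, \quad \mathbf{x} = \{x_1 \;\; x_2 \;\; x_3 \;\; x_4 \;\; x_5\}^T$$

$$\mathbf{A} = \begin{bmatrix} 3 & -6 & 7 & -9 & -9 \\ 1 & 2 & 0 & -1 & -3 \end{bmatrix}, \quad \mathbf{b} = \begin{Bmatrix} -10 \\ 0 \end{Bmatrix}$$
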


Step 2: The input is directly typed on the MATLAB command window and the program 
bintprog is called as indicated below: 

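clear; clc;
f = [-5 -5 -8 2 4]';                 % objective coefficients
A = [3 -6 7 -9 -9; 1 2 0 -1 -3];     % inequality constraint matrix
b = [-10 0]';                        % right-hand side of A*x <= b
x = bintprog(f, A, b, [])            % no equality constraints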
Step 3: The output of the program is shown below:

Optimization terminated.
x =
     1
     1
     1
     1
     1
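With only five binary variables, the result of Example 10.7 can be confirmed independently by enumerating all $2^5 = 32$ candidate vectors. The Python sketch below does this for the data $\mathbf{f}$, $\mathbf{A}$, and $\mathbf{b}$ of Step 1; it is a brute-force check rather than the algorithm used by bintprog, and the function name `solve_binary_lp` is hypothetical.

```python
from itertools import product

# Data of Example 10.7.
f = (-5, -5, -8, 2, 4)
A = ((3, -6, 7, -9, -9),
     (1, 2, 0, -1, -3))
b = (-10, 0)


def solve_binary_lp(f, A, b):
    """Minimize f.x subject to A x <= b over all 0-1 vectors x."""
    best_x, best_val = None, None
    for x in product((0, 1), repeat=len(f)):
        # Keep x only if every inequality row is satisfied.
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= bi
            for row, bi in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(c * xi for c, xi in zip(f, x))
        if best_val is None or val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


x_opt, f_opt = solve_binary_lp(f, A, b)
print(x_opt, f_opt)
```

The enumeration agrees with the output above: all five variables equal 1, with an objective value of $-12$.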
REVIEW QUESTIONS

10.1 Answer true or false: 

(a) The integer and discrete programming problems are one and the same. 

(b) Gomory’s cutting plane method is applicable to mixed-integer programming 
problems. 

(c) The Balas method was developed for the solution of all-integer programming 
problems. 

(d) The branch-and-bound method can be used to solve zero-one programming 
problems. 

(e) The branch-and-bound method is applicable to nonlinear integer programming 
problems. 
10.2 Define the following terms:

(a) Cutting plane 

(b) Gomory’s constraint 

(c) Mixed-integer programming problem 

(d) Additive algorithm 

10.3 Give two engineering examples of a discrete programming problem. 

10.4 Name two engineering systems for which zero-one programming is applicable. 

10.5 What are the disadvantages of truncating the fractional part of a continuous solution for 
an integer problem? 

10.6 How can you solve an integer nonlinear programming problem? 

10.7 What is a branch-and-bound method? 

10.8 Match the following methods:

(a) Land and Doig        Cutting plane method

(b) Gomory               Zero-one programming method

(c) Balas                Generalized penalty function method

(d) Gisvold and Moe      Branch-and-bound method

(e) Reiter and Rice      Generalized quadratic programming method


PROBLEMS 


Find the solution for Problems 10.1-10.5 using a graphical procedure. 

10.1 Minimize $f = 4x_1 + 5x_2$

subject to

$$3x_1 + x_2 \ge 2$$
$$x_1 + 4x_2 \ge 5$$
$$3x_1 + 2x_2 \ge 7$$
$$x_1, x_2 \ge 0, \text{ integers}$$

10.2 Maximize $f = 4x_1 + 8x_2$

subject to

$$4x_1 + 5x_2 \le 40$$
$$x_1 + 2x_2 \le 12$$
$$x_1, x_2 \ge 0, \text{ integers}$$

10.3 Maximize $f = 4x_1 + 3x_2$

subject to

$$3x_1 + 2x_2 \le 18$$
$$x_1, x_2 \ge 0, \text{ integers}$$

10.4 Maximize $f = 3x_1 - x_2$

subject to

$$3x_1 - 2x_2 \le 3$$
$$-5x_1 - 4x_2 \le -10$$
$$x_1, x_2 \ge 0, \text{ integers}$$

10.5 Maximize $f = 2x_1 + x_2$

subject to

$$8x_1 + 5x_2 \le 15$$
$$x_1, x_2 \ge 0, \text{ integers}$$


10.6 Solve the following problem using Gomory's cutting plane method:

Maximize $f = 6x_1 + 7x_2$

subject to

$$7x_1 + 6x_2 \le 42$$
$$5x_1 + 9x_2 \le 45$$
$$x_1 - x_2 \le 4$$
$$x_i \ge 0 \text{ and integer}, \quad i = 1, 2$$

10.7 Solve the following problem using Gomory's cutting plane method:

Maximize $f = x_1 + 2x_2$

subject to

$$x_1 + x_2 \le 7$$
$$2x_1 \le 11, \quad 2x_2 \le 7$$
$$x_i \ge 0 \text{ and integer}, \quad i = 1, 2$$


10.8 Express 187 in binary form. 

10.9 Three cities A, B, and C are to be connected by a pipeline. The distances between A and 
B, B and C, and C and A are 5, 3, and 4 units, respectively. The following restrictions 
are to be satisfied by the pipeline: 

1. The pipes leading out of A should have a total capacity of at least 3. 
2. The pipes leading out of B or of C should have total capacities of either 2 or 3.

3. No pipe between any two cities must have a capacity exceeding 2.

Only pipes of an integer number of capacity units are available and the cost of a pipe is proportional to its capacity and to its length. Determine the capacities of the pipelines to minimize the total cost.

10.10 Convert the following integer quadratic problem into a zero-one linear programming problem:

Minimize $f = 2x_1^2 + 3x_2^2 + 4x_1 x_2 - 6x_1 - 3x_2$

subject to

$$x_1 + x_2 \le 1$$
$$2x_1 + 3x_2 \le 4$$
$$x_1, x_2 \ge 0, \text{ integers}$$

10.11 Convert the following integer programming problem into an equivalent zero-one programming problem:

Minimize $f = 6x_1 - x_2$

subject to

$$3x_1 - x_2 \ge 4$$
$$2x_1 + x_2 \ge 3$$
$$-x_1 - x_2 \ge -3$$
$$x_1, x_2 \text{ nonnegative integers}$$


10.12 Solve the following zero-one programming problem using an exhaustive enumeration procedure:

Maximize $f = -10x_1 - 5x_2 - 3x_3$

subject to

$$x_1 + 2x_2 + x_3 \ge 4$$
$$2x_1 + x_2 + x_3 \le 6$$
$$x_i = 0 \text{ or } 1, \quad i = 1, 2, 3$$

10.13 Solve the following binary programming problem using an exhaustive enumeration procedure:

Minimize $f = -5x_1 + 7x_2 + 10x_3 - 3x_4 + x_5$

subject to

$$x_1 + 3x_2 - 5x_3 + x_4 + 4x_5 \le 0$$
$$2x_1 + 6x_2 - 3x_3 + 2x_4 + 2x_5 \ge 4$$
$$x_2 - 2x_3 - x_4 + x_5 \le -2$$
$$x_i = 0 \text{ or } 1, \quad i = 1, 2, \ldots, 5$$
10.14 Find the solution of Problem 10.1 using the branch-and-bound method coupled with the graphical method of solution for the branching problems.

10.15 Find the solution of the following problem using the branch-and-bound method coupled with the graphical method of solution for the branching problems:

Maximize $f = x_1 - 4x_2$

subject to

$$x_1 - x_2 \ge -4, \quad 4x_1 + 5x_2 \le 45$$
$$5x_1 - 2x_2 \le 20, \quad 5x_1 + 2x_2 \ge 10$$
$$x_i \ge 0 \text{ and integer}, \quad i = 1, 2$$

10.16 Solve the following mixed integer programming problem using a graphical method:

Minimize $f = 4x_1 + 5x_2$

subject to

$$10x_1 + x_2 \ge 10, \quad 5x_1 + 4x_2 \ge 20$$
$$3x_1 + 7x_2 \ge 21, \quad x_1 + 12x_2 \ge 12$$
$$x_1 \ge 0 \text{ and integer}, \quad x_2 \ge 0$$

10.17 Solve Problem 10.16 using the branch-and-bound method coupled with a graphical 
method for the solution of the branching problems. 

10.18 Convert the following problem into an equivalent zero-one LP problem:

Maximize $f = x_1 x_2$

subject to

$$x_1^2 + x_2^2 \le 25, \quad x_i \ge 0 \text{ and integer}, \quad i = 1, 2$$

10.19 Consider the discrete variable problem:

Maximize $f = x_1 x_2$

subject to

$$x_1^2 + x_2^2 \le 4$$
$$x_1 \in \{0.1, 0.5, 1.1, 1.6, 2.0\}$$
$$x_2 \in \{0.4, 0.8, 1.5, 2.0\}$$

Approximate this problem as a zero-one LP problem at the vector $\mathbf{X}^0 = \{1.1, 0.8\}^T$.
10.20 Find the solution of the following problem using a graphical method based on the generalized penalty function approach:

Minimize $f = x$

subject to

$$x - 1 \ge 0 \quad \text{with} \quad x = \{1, 2, 3, \ldots\}$$

Select suitable values of $r_k$ and $s_k$ to construct the $\phi_k$ function.

10.21 Find the solution of the following binary programming problem using the MATLAB function bintprog:

Minimize $\mathbf{f}^T \mathbf{x}$ subject to $\mathbf{A}\mathbf{x} \le \mathbf{b}$ and $\mathbf{A}_{eq}\mathbf{x} = \mathbf{b}_{eq}$

where

10.22 Find the solution of the following binary programming problem using the MATLAB function bintprog:

Minimize $\mathbf{f}^T \mathbf{x}$ subject to $\mathbf{A}\mathbf{x} \le \mathbf{b}$

where

$$\mathbf{f} = \{-2 \;\; -3 \;\; -1 \;\; -4 \;\; -3 \;\; -2 \;\; -2 \;\; -1 \;\; -3\}^T$$

$$\mathbf{x} = \{x_1 \;\; x_2 \;\; x_3 \;\; x_4 \;\; x_5 \;\; x_6 \;\; x_7 \;\; x_8 \;\; x_9\}^T$$
11.1 INTRODUCTION

Stochastic or probabilistic programming deals with situations where some or all of 
the parameters of the optimization problem are described by stochastic (or random or 
probabilistic) variables rather than by deterministic quantities. The sources of random 
variables may be several, depending on the nature and the type of problem. For instance, 
in the design of concrete structures, the strength of concrete is a random variable since 
the compressive strength of concrete varies considerably from sample to sample. In 
the design of mechanical systems, the actual dimension of any machined part is a 
random variable since the dimension may lie anywhere within a specified (permissible) 
tolerance band. Similarly, in the design of aircraft and rockets the actual loads acting 
on the vehicle depend on the atmospheric conditions prevailing at the time of the flight, 
which cannot be predicted precisely in advance. Hence the loads are to be treated as 
random variables in the design of such flight vehicles. 

Depending on the nature of equations involved (in terms of random variables) in 
the problem, a stochastic optimization problem is called a stochastic linear, geometric, 
dynamic, or nonlinear programming problem. The basic idea used in stochastic programming
is to convert the stochastic problem into an equivalent deterministic problem.
The resulting deterministic problem is then solved by using familiar techniques such as 
linear, geometric, dynamic, and nonlinear programming. A review of the basic concepts 
of probability theory that are necessary for understanding the techniques of stochastic 
programming is given in Section 11.2. The stochastic linear, nonlinear, and geometric 
programming techniques are discussed in subsequent sections. 


11.2 BASIC CONCEPTS OF PROBABILITY THEORY 

The material of this section is by no means exhaustive of probability theory. Rather, 
it provides the basic background necessary for the continuity of presentation of this 
chapter. The reader interested in further details should consult Parzen [11.1], Ang and 
Tang [11.2], or Rao [11.3]. 

11.2.1 Definition of Probability 

Every phenomenon in real life has a certain element of uncertainty. For example, 
the wind velocity at a particular locality, the number of vehicles crossing a bridge, 
the strength of a beam, and the life of a machine cannot be predicted exactly. These 
phenomena are chance dependent and one has to resort to probability theory to describe
the characteristics of such phenomena. 

Before introducing the concept of probability, it is necessary to define certain terms 
such as experiment and event. An experiment denotes the act of performing something 
the outcome of which is subject to uncertainty and is not known exactly. For example, 
tossing a coin, rolling a die, and measuring the yield strength of steel can be called 
experiments. The number of possible outcomes in an experiment may be finite or 
infinite, depending on the nature of the experiment. The outcome is a head or a tail 
in the case of tossing a coin, and any one of the numbers 1, 2, 3, 4, 5, and 6 in the 
case of rolling a die. On the other hand, the outcome may be any positive real number 
in the case of measuring the yield strength of steel. An event represents the outcome 
of a single experiment. For example, realizing a head on tossing a coin, getting the 
number 3 or 5 on rolling a die, and observing the yield strength of steel to be greater 
than 20,000 psi in measurement can be called events. 

The probability is defined in terms of the likelihood of a specific event. If $E$ denotes an event, the probability of occurrence of the event $E$ is usually denoted by $P(E)$. The probability of occurrence depends on the number of observations or trials. It is given by

$$P(E) = \lim_{n \to \infty} \frac{m}{n} \tag{11.1}$$

where $m$ is the number of successful occurrences of the event $E$ and $n$ is the total number of trials. From Eq. (11.1) we can see that probability is a nonnegative number and

$$0 \le P(E) \le 1.0 \tag{11.2}$$

where $P(E) = 0$ denotes that the event is impossible to realize while $P(E) = 1.0$ signifies that the event is certain to occur. For example, the probability associated with the event of realizing both the head and the tail on tossing a coin is zero (impossible event), while the probability of the event that a rolled die will show up any number between 1 and 6 is 1 (certain event).

Independent Events. If the occurrence of an event $E_1$ in no way affects the probability of occurrence of another event $E_2$, the events $E_1$ and $E_2$ are said to be statistically independent. In this case the probability of simultaneous occurrence of both the events is given by

$$P(E_1 E_2) = P(E_1) P(E_2) \tag{11.3}$$

For example, if $P(E_1) = P(\text{raining at a particular location}) = 0.4$ and $P(E_2) = P(\text{realizing the head on tossing a coin}) = 0.7$, obviously $E_1$ and $E_2$ are statistically independent and

$$P(E_1 E_2) = P(E_1) P(E_2) = 0.28$$


11.2.2 Random Variables and Probability Density Functions 

An event has been defined as a possible outcome of an experiment. Let us assume that a random event is the measurement of a quantity $X$, which takes on various values in the range $-\infty$ to $\infty$. Such a quantity (like $X$) is called a random variable. We denote a random variable by a capital letter and the particular value taken by it by a lowercase letter. Random variables are of two types: (1) discrete and (2) continuous. If the random variable is allowed to take only discrete values $x_1, x_2, \ldots, x_n$, it is called a discrete random variable. On the other hand, if the random variable is permitted to take any real value in a specified range, it is called a continuous random variable. For example, the number of vehicles crossing a bridge in a day is a discrete random variable, whereas the yield strength of steel can be treated as a continuous random variable.

Probability Mass Function (for Discrete Random Variables). Corresponding to each $x_i$ that a discrete random variable $X$ can take, we can associate a probability of occurrence $P(x_i)$. We can describe the probabilities associated with the random variable $X$ by a table of values, but it will be easier to write a general formula that permits one to calculate $P(x_i)$ by substituting the appropriate value of $x_i$. Such a formula is called the probability mass function of the random variable $X$ and is usually denoted as $f_X(x_i)$, or simply as $f(x_i)$. Thus the function that gives the probability of realizing the random variable $X = x_i$ is called the probability mass function $f_X(x_i)$. Therefore,

$$f(x_i) = f_X(x_i) = P(X = x_i) \tag{11.4}$$

Cumulative Distribution Function (Discrete Case). Although a random variable $X$ is described completely by the probability mass function, it is often convenient to deal with another, related function known as the probability distribution function. The probability that the value of the random variable $X$ is less than or equal to some number $x$ is defined as the cumulative distribution function $F_X(x)$:

$$F_X(x) = P(X \le x) = \sum_{i} f_X(x_i) \tag{11.5}$$

where the summation extends over those values of $i$ such that $x_i \le x$. Since the distribution function is a cumulative probability, it is also called the cumulative distribution function.

Example 11.1 Find the probability mass and distribution functions for the number realized when a fair die is thrown.

SOLUTION Since each face is equally likely to show up, the probability of realizing any number between 1 and 6 is $\frac{1}{6}$.

$$P(X = 1) = P(X = 2) = \cdots = P(X = 6) = \tfrac{1}{6}$$

$$f_X(1) = f_X(2) = \cdots = f_X(6) = \tfrac{1}{6}$$

The analytical form of $F_X(x)$ is

$$F_X(x) = \frac{x}{6} \quad \text{for } 1 \le x \le 6$$

It can be seen that for any discrete random variable, the distribution function will be a step function. If the least possible value of a variable $X$ is $S$ and the greatest possible value is $T$, then

$$F_X(x) = 0 \text{ for all } x < S \quad \text{and} \quad F_X(x) = 1 \text{ for all } x \ge T$$
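The step-function character of a discrete distribution function is easy to see by tabulating it for the fair die of Example 11.1. The Python sketch below uses exact fractions; the names `pmf` and `cdf` are illustrative choices.

```python
from fractions import Fraction

# Probability mass function of a fair die: f_X(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def cdf(x):
    """F_X(x) = sum of f_X(x_i) over all x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))


# The value stays constant between integers and jumps by 1/6 at each face.
for x in (0.5, 1, 2.5, 3, 3.7, 6, 10):
    print(x, cdf(x))
```

Below the least value $S = 1$ the function is 0, from the greatest value $T = 6$ onward it is 1, and at the integer points it coincides with $x/6$.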

Probability Density Function (Continuous Case). The probability density function of a random variable is defined by

$$f_X(x)\, dx = P(x \le X \le x + dx) \tag{11.6}$$

which is equal to the probability of detecting $X$ in the infinitesimal interval $(x, x + dx)$. The distribution function of $X$ is defined as the probability of detecting $X$ less than or equal to $x$, that is,

$$F_X(x) = \int_{-\infty}^{x} f_X(x')\, dx' \tag{11.7}$$

where the condition $F_X(-\infty) = 0$ has been used. As the upper limit of the integral goes to infinity, we have

$$\int_{-\infty}^{\infty} f_X(x)\, dx = F_X(\infty) = 1 \tag{11.8}$$

This is called the normalization condition. A typical probability density function and the corresponding distribution function are shown in Fig. 11.1.

11.2.3 Mean and Standard Deviation 

The probability density or distribution function of a random variable contains all the 
information about the variable. However, in many cases we require only the gross 
properties, not entire information about the random variable. In such cases one computes 
only the mean and the variation about the mean of the random variable as the salient 
features of the variable. 

Mean. The mean value (also termed the expected value or average) is used to 
describe the central tendency of a random variable. 


Figure 11.1 Probability density and distribution functions of a continuous random variable $X$: (a) density function; (b) distribution function.

Discrete Case. Let us assume that there are $n$ trials in which the random variable $X$ is observed to take on the value $x_1$ ($n_1$ times), $x_2$ ($n_2$ times), and so on, and $n_1 + n_2 + \cdots + n_m = n$. Then the arithmetic mean of $X$, denoted as $\bar{X}$, is given by

$$\bar{X} = \frac{\sum_{k=1}^{m} x_k n_k}{n} = \sum_{k=1}^{m} x_k \frac{n_k}{n} = \sum_{k=1}^{m} x_k f_X(x_k) \tag{11.9}$$

where $n_k/n$ is the relative frequency of occurrence of $x_k$ and is the same as the probability mass function $f_X(x_k)$. Hence in general, the expected value, $E(X)$, of a discrete random variable can be expressed as

$$\bar{X} = E(X) = \sum_{i} x_i f_X(x_i), \quad \text{sum over all } i \tag{11.10}$$


Continuous Case. If $f_X(x)$ is the density function of a continuous random variable, $X$, the mean is given by

$$\bar{X} = \mu_X = E(X) = \int_{-\infty}^{\infty} x f_X(x)\, dx \tag{11.11}$$


Standard Deviation. The expected value or mean is a measure of the central tendency, indicating the location of a distribution on some coordinate axis. A measure of the variability of the random variable is usually given by a quantity known as the standard deviation. The mean-square deviation or variance of a random variable $X$ is defined as

$$\begin{aligned} \sigma_X^2 = \mathrm{Var}(X) &= E[(X - \mu_X)^2] \\ &= E[X^2 - 2X\mu_X + \mu_X^2] \\ &= E(X^2) - 2\mu_X E(X) + E(\mu_X^2) \\ &= E(X^2) - \mu_X^2 \end{aligned} \tag{11.12}$$

and the standard deviation as

$$\sigma_X = +\sqrt{\mathrm{Var}(X)} = \sqrt{E(X^2) - \mu_X^2} \tag{11.13}$$

The coefficient of variation (a measure of dispersion in nondimensional form) is defined as

$$\text{coefficient of variation of } X = \gamma_X = \frac{\text{standard deviation}}{\text{mean}} = \frac{\sigma_X}{\mu_X} \tag{11.14}$$

Figure 11.2 shows two density functions with the same mean $\mu_X$ but with different variances. As can be seen, the variance measures the breadth of a density function.


Example 11.2 The number of airplane landings at an airport in a minute ($X$) and their probabilities are given by

x_i         0     1     2     3     4     5     6
p_X(x_i)    0.02  0.15  0.22  0.26  0.17  0.14  0.04

Find the mean and standard deviation of $X$.

Figure 11.2 Two density functions with same mean.

SOLUTION

$$\bar{X} = \sum_{i=0}^{6} x_i p_X(x_i) = 0(0.02) + 1(0.15) + 2(0.22) + 3(0.26) + 4(0.17) + 5(0.14) + 6(0.04) = 2.99$$

$$\overline{X^2} = \sum_{i=0}^{6} x_i^2 p_X(x_i) = 0(0.02) + 1(0.15) + 4(0.22) + 9(0.26) + 16(0.17) + 25(0.14) + 36(0.04) = 11.03$$

Thus

$$\sigma_X^2 = \overline{X^2} - (\bar{X})^2 = 11.03 - (2.99)^2 = 2.0899 \quad \text{or} \quad \sigma_X = 1.4456$$
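The arithmetic of Example 11.2 follows directly from Eqs. (11.10) and (11.12) and can be reproduced with a few lines of Python. The variable names below are illustrative.

```python
import math

# Number of landings per minute and their probabilities (Example 11.2).
values = [0, 1, 2, 3, 4, 5, 6]
probs = [0.02, 0.15, 0.22, 0.26, 0.17, 0.14, 0.04]

mean = sum(x * p for x, p in zip(values, probs))            # E(X)
mean_square = sum(x * x * p for x, p in zip(values, probs))  # E(X^2)
variance = mean_square - mean**2                             # Eq. (11.12)
std_dev = math.sqrt(variance)                                # Eq. (11.13)

print(f"mean = {mean:.2f}, E[X^2] = {mean_square:.2f}")
print(f"variance = {variance:.4f}, std dev = {std_dev:.4f}")
```

The same four lines work for any tabulated probability mass function, provided the probabilities sum to 1.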


Example 11.3 The force applied on an engine brake ($X$) is given by

$$f_X(x) = \begin{cases} \dfrac{x}{48}, & 0 \le x < 8 \text{ lb} \\[2mm] \dfrac{12 - x}{24}, & 8 \le x \le 12 \text{ lb} \end{cases}$$

Determine the mean and standard deviation of the force applied on the brake.

SOLUTION

$$\mu_X = E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx = \int_{0}^{8} x \frac{x}{48}\, dx + \int_{8}^{12} x \frac{12 - x}{24}\, dx = 6.6667$$

$$E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\, dx = \int_{0}^{8} x^2 \frac{x}{48}\, dx + \int_{8}^{12} x^2 \frac{12 - x}{24}\, dx = 21.3333 + 29.3333 = 50.6666$$

$$\sigma_X^2 = E[X^2] - (E[X])^2 = 50.6666 - (6.6667)^2 = 6.2222 \quad \text{or} \quad \sigma_X = 2.4944$$

11.2.4 Function of a Random Variable 

If X is a random variable, any other variable Y defined as a function of X will also be 
a random variable. If fx(x) and F x (x) denote, respectively, the probability density and 
distribution functions of X, the problem is to find the density function fy(y) and the 
distribution function F Y {y) of the random variable Y . Let the functional relation be 

Y = g{X) (11.15) 


By definition, the distribution function of Y is the probability of realizing Y less than
or equal to y:

$$F_Y(y) = P(Y \le y) = P(g \le y) = \int_{g(x) \le y} f_X(x)\,dx \tag{11.16}$$

where the integration is to be done over all values of x for which $g(x) \le y$.

For example, if the functional relation between y and x is as shown in Fig. 11.3,
the range of integration is shown as $\Delta x_1 + \Delta x_2 + \Delta x_3 + \cdots$. The probability density
function of Y is given by

$$f_Y(y) = \frac{\partial}{\partial y}[F_Y(y)] \tag{11.17}$$

If $Y = g(X)$, the mean and variance of Y are defined, respectively, by

$$E(Y) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx \tag{11.18}$$

$$\operatorname{Var}[Y] = \int_{-\infty}^{\infty} [g(x) - E(Y)]^2 f_X(x)\,dx \tag{11.19}$$

Figure 11.3 Range of integration in Eq. (11.16).

11.2.5 Jointly Distributed Random Variables

When two or more random variables are being considered simultaneously, their joint
behavior is determined by a joint probability distribution function. The probability
distributions of single random variables are called univariate distributions and the
distributions that involve two random variables are called bivariate distributions. In
general, if a distribution involves more than one random variable, it is called a
multivariate distribution.

Joint Density and Distribution Functions. We can define the joint density function
of n continuous random variables $X_1, X_2, \ldots, X_n$ as

$$f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)\,dx_1 \cdots dx_n = P(x_1 \le X_1 \le x_1 + dx_1,\; x_2 \le X_2 \le x_2 + dx_2,\; \ldots,\; x_n \le X_n \le x_n + dx_n) \tag{11.20}$$

If the random variables are independent, the joint density function is given by the
product of individual or marginal density functions as

$$f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n) \tag{11.21}$$

The joint distribution function $F_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n)$
associated with the density function of Eq. (11.20) is given by

$$F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = P[X_1 \le x_1, \ldots, X_n \le x_n] = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f_{X_1,\ldots,X_n}(x_1', x_2', \ldots, x_n')\,dx_1'\,dx_2' \cdots dx_n' \tag{11.22}$$

If $X_1, X_2, \ldots, X_n$ are independent random variables, we have

$$F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = F_{X_1}(x_1)\,F_{X_2}(x_2) \cdots F_{X_n}(x_n) \tag{11.23}$$

It can be seen that the joint density function can be obtained by differentiating the joint
distribution function as

$$f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \frac{\partial^n}{\partial x_1\,\partial x_2 \cdots \partial x_n} F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) \tag{11.24}$$

Obtaining the Marginal or Individual Density Function from the Joint Density
Function. Let the joint density function of two random variables X and Y be denoted
by $f_{X,Y}(x, y)$ and the marginal density functions of X and Y by $f_X(x)$ and $f_Y(y)$,
respectively. Take the infinitesimal rectangle with corners located at the points (x, y),
(x + dx, y), (x, y + dy), and (x + dx, y + dy). The probability of a random point (x', y')
falling in this rectangle is $f_{X,Y}(x, y)\,dx\,dy$. The integral of such probability elements with
respect to y (for a fixed value of x) is the sum of the probabilities of all the mutually
exclusive ways of obtaining the points lying between x and x + dx. Let the lower and
upper limits of y be $a_1(x)$ and $b_1(x)$. Then

$$P[x \le x' \le x + dx] = \left[\int_{a_1(x)}^{b_1(x)} f_{X,Y}(x, y)\,dy\right] dx = f_X(x)\,dx$$

$$f_X(x) = \int_{y_1 = a_1(x)}^{y_2 = b_1(x)} f_{X,Y}(x, y)\,dy \tag{11.25}$$

Similarly, we can show that

$$f_Y(y) = \int_{x_1 = a_2(y)}^{x_2 = b_2(y)} f_{X,Y}(x, y)\,dx \tag{11.26}$$


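11.2.6 Covariance and Correlation

If X and Y are two jointly distributed random variables, the variances of X and Y are
defined as

$$E[(X - \overline{X})^2] = \operatorname{Var}[X] = \int_{-\infty}^{\infty} (x - \overline{X})^2 f_X(x)\,dx \tag{11.27}$$

$$E[(Y - \overline{Y})^2] = \operatorname{Var}[Y] = \int_{-\infty}^{\infty} (y - \overline{Y})^2 f_Y(y)\,dy \tag{11.28}$$

and the covariance of X and Y as

$$E[(X - \overline{X})(Y - \overline{Y})] = \operatorname{Cov}(X, Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \overline{X})(y - \overline{Y})\,f_{X,Y}(x, y)\,dx\,dy = \sigma_{X,Y} \tag{11.29}$$

The correlation coefficient, $\rho_{X,Y}$, for the random variables is defined as

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} \tag{11.30}$$

and it can be proved that $-1 \le \rho_{X,Y} \le 1$.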
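For discrete variables the integrals in Eqs. (11.27) to (11.30) become sums over the joint probability mass function. The sketch below evaluates them for a small, hypothetical joint distribution; the numbers in `joint` are illustrative assumptions, not data from the text.

```python
import math

# Hypothetical joint probability mass function p[(x, y)] of two discrete
# random variables (illustrative numbers only).
joint = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.20, (1, 1): 0.40,
}

def expectation(func, pmf):
    """E[func(X, Y)] for a discrete joint distribution."""
    return sum(func(x, y) * p for (x, y), p in pmf.items())

mean_x = expectation(lambda x, y: x, joint)
mean_y = expectation(lambda x, y: y, joint)
var_x = expectation(lambda x, y: (x - mean_x) ** 2, joint)
var_y = expectation(lambda x, y: (y - mean_y) ** 2, joint)
cov_xy = expectation(lambda x, y: (x - mean_x) * (y - mean_y), joint)

# Correlation coefficient, the discrete counterpart of Eq. (11.30).
rho = cov_xy / math.sqrt(var_x * var_y)
print(f"Cov = {cov_xy:.4f}, rho = {rho:.4f}")
```

A positive value of `rho` here reflects that large values of X tend to occur together with large values of Y in the assumed table.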


11.2.7 Functions of Several Random Variables 

If Y is a function of several random variables $X_1, X_2, \ldots, X_n$, the distribution and
density functions of Y can be found in terms of the joint density function of $X_1, X_2, \ldots, X_n$
as follows. Let

$$Y = g(X_1, X_2, \ldots, X_n) \tag{11.31}$$

Then the distribution function $F_Y(y)$, by definition, is given by

$$F_Y(y) = P(Y \le y) = \int\!\cdots\!\int_{g(x_1, x_2, \ldots, x_n) \le y} f_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n \tag{11.32}$$

where the integration is to be done over the domain of the n-dimensional
$(X_1, X_2, \ldots, X_n)$ space in which the inequality $g(x_1, x_2, \ldots, x_n) \le y$ is satisfied. By
differentiating Eq. (11.32), we can get the density function of Y, $f_Y(y)$.

As in the case of a function of a single random variable, the mean and variance
of a function of several random variables are given by

$$E(Y) = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} g(x_1, x_2, \ldots, x_n)\,f_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n \tag{11.33}$$

and

$$\operatorname{Var}(Y) = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} [g(x_1, x_2, \ldots, x_n) - \overline{Y}]^2\,f_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n \tag{11.34}$$

In particular, if Y is a linear function of two random variables $X_1$ and $X_2$, we have

$$Y = a_1 X_1 + a_2 X_2$$

where $a_1$ and $a_2$ are constants. In this case

$$E(Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (a_1 x_1 + a_2 x_2)\,f_{X_1,X_2}(x_1, x_2)\,dx_1\,dx_2 = a_1\int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\,dx_1 + a_2\int_{-\infty}^{\infty} x_2 f_{X_2}(x_2)\,dx_2 = a_1 E(X_1) + a_2 E(X_2) \tag{11.35}$$

Thus the expected value of a sum is given by the sum of the expected values. The
variance of Y can be obtained as

$$\begin{aligned}
\operatorname{Var}(Y) &= E[(a_1 X_1 + a_2 X_2) - (a_1 \overline{X}_1 + a_2 \overline{X}_2)]^2 \\
&= E[a_1 (X_1 - \overline{X}_1) + a_2 (X_2 - \overline{X}_2)]^2 \\
&= E[a_1^2 (X_1 - \overline{X}_1)^2 + 2 a_1 a_2 (X_1 - \overline{X}_1)(X_2 - \overline{X}_2) + a_2^2 (X_2 - \overline{X}_2)^2]
\end{aligned} \tag{11.36}$$

Noting that the expected values of the first and the third terms are variances, whereas
that of the middle term is a covariance, we obtain

$$\operatorname{Var}(Y) = a_1^2 \operatorname{Var}(X_1) + a_2^2 \operatorname{Var}(X_2) + 2 a_1 a_2 \operatorname{Cov}(X_1, X_2) \tag{11.37}$$

These results can be generalized to the case when Y is a linear function of several
random variables. Thus if

$$Y = \sum_{i=1}^{n} a_i X_i \tag{11.38}$$

then

$$E(Y) = \sum_{i=1}^{n} a_i E(X_i) \tag{11.39}$$

$$\operatorname{Var}(Y) = \sum_{i=1}^{n} a_i^2 \operatorname{Var}(X_i) + \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j \operatorname{Cov}(X_i, X_j), \quad i \ne j \tag{11.40}$$
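Equations (11.39) and (11.40) translate directly into a few lines of code. The following is a minimal sketch; the function name `linear_combination_moments` and the numerical data are assumptions chosen only for illustration.

```python
def linear_combination_moments(a, means, cov):
    """Mean and variance of Y = sum(a[i] * X[i]), following Eqs. (11.39) and (11.40).

    cov is the full covariance matrix: cov[i][i] = Var(X_i) and
    cov[i][j] = Cov(X_i, X_j) for i != j.
    """
    n = len(a)
    mean_y = sum(a[i] * means[i] for i in range(n))
    # Variance terms (i == j) ...
    var_y = sum(a[i] ** 2 * cov[i][i] for i in range(n))
    # ... plus covariance terms (i != j), each unordered pair counted twice.
    var_y += sum(a[i] * a[j] * cov[i][j]
                 for i in range(n) for j in range(n) if i != j)
    return mean_y, var_y

# Illustrative (assumed) data for two correlated variables.
a = [2.0, -1.0]
means = [10.0, 4.0]
cov = [[4.0, 1.5],
       [1.5, 9.0]]
mean_y, var_y = linear_combination_moments(a, means, cov)
print(mean_y, var_y)
```

With two variables the double sum contributes $2 a_1 a_2 \operatorname{Cov}(X_1, X_2)$, which is the last term of Eq. (11.37).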

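Approximate Mean and Variance of a Function of Several Random Variables.
If $Y = g(X_1, \ldots, X_n)$, the approximate mean and variance of Y can be obtained as
follows. Expand the function g in a Taylor series about the mean values $\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n$
to obtain

$$Y = g(\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n) + \sum_{i=1}^{n} (X_i - \overline{X}_i)\frac{\partial g}{\partial X_i} + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} (X_i - \overline{X}_i)(X_j - \overline{X}_j)\frac{\partial^2 g}{\partial X_i\,\partial X_j} + \cdots \tag{11.41}$$

where the derivatives are evaluated at $(\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n)$. By truncating the series at
the linear terms, we obtain the first-order approximation to Y as

$$Y \simeq g(\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n) + \sum_{i=1}^{n} (X_i - \overline{X}_i)\left.\frac{\partial g}{\partial X_i}\right|_{(\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n)} \tag{11.42}$$

The mean and variance of Y given by Eq. (11.42) can now be expressed as [using
Eqs. (11.39) and (11.40)]

$$E(Y) \simeq g(\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n) \tag{11.43}$$

$$\operatorname{Var}(Y) \simeq \sum_{i=1}^{n} c_i^2 \operatorname{Var}(X_i) + \sum_{i=1}^{n}\sum_{j=1}^{n} c_i c_j \operatorname{Cov}(X_i, X_j), \quad i \ne j \tag{11.44}$$

where $c_i$ and $c_j$ are the values of the partial derivatives $\partial g/\partial X_i$ and $\partial g/\partial X_j$,
respectively, evaluated at $(\overline{X}_1, \overline{X}_2, \ldots, \overline{X}_n)$.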
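The first-order estimates of Eqs. (11.43) and (11.44) can be sketched numerically as follows. Several things here are assumptions made for illustration: the variables are taken as independent (all covariances zero), the partial derivatives $c_i$ are estimated by central differences rather than analytically, and the test function $Y = X_1 X_2$ with its data is hypothetical.

```python
def first_order_moments(g, means, variances, h=1e-6):
    """Approximate mean and variance of Y = g(X_1, ..., X_n) for independent X_i.

    The partial derivatives c_i at the mean point are estimated by
    central differences; covariance terms are assumed to be zero.
    """
    mean_y = g(means)
    var_y = 0.0
    for i, (m, v) in enumerate(zip(means, variances)):
        step = h * max(1.0, abs(m))
        up = list(means)
        dn = list(means)
        up[i] = m + step
        dn[i] = m - step
        c_i = (g(up) - g(dn)) / (2.0 * step)
        var_y += c_i ** 2 * v
    return mean_y, var_y

# Assumed illustration: Y = X1 * X2 with independent X1 and X2.
means = [10.0, 5.0]
variances = [0.25, 0.04]
approx_mean, approx_var = first_order_moments(lambda x: x[0] * x[1], means, variances)

# Exact variance of a product of independent variables, for comparison.
exact_var = (means[0] ** 2 * variances[1] + means[1] ** 2 * variances[0]
             + variances[0] * variances[1])
print(approx_mean, approx_var, exact_var)
```

For this product the first-order variance omits only the term $\operatorname{Var}(X_1)\operatorname{Var}(X_2)$, which is small when the coefficients of variation are small.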

It is worth noting at this stage that the approximation given by Eq. (11.42)
is frequently used in practical problems to simplify the computations involved.

11.2.8 Probability Distributions

There are several types of probability distributions (analytical models) for describing
various types of discrete and continuous random variables. Some of the common
distributions are given below:

Discrete case                        Continuous case
Discrete uniform distribution        Uniform distribution
Binomial                             Normal or Gaussian
Geometric                            Gamma
Multinomial                          Exponential
Poisson                              Beta
Hypergeometric                       Rayleigh
Negative binomial (or Pascal's)      Weibull

In any physical problem, one chooses a particular type of probability distribution
depending on (1) the nature of the problem, (2) the underlying assumptions associated
with the distribution, (3) the shape of the graph between f(x) or F(x) and x obtained
after plotting the available data, and (4) the convenience and simplicity afforded by the
distribution.


Normal Distribution. The best known and most widely used probability distribution
is the Gaussian or normal distribution. The normal distribution has a probability density
function given by

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X}\,e^{-\frac{1}{2}[(x - \mu_X)/\sigma_X]^2}, \quad -\infty < x < \infty \tag{11.45}$$

where $\mu_X$ and $\sigma_X$ are the parameters of the distribution, which are also the mean and
standard deviation of X, respectively. The normal distribution is often identified as
$N(\mu_X, \sigma_X)$.

Standard Normal Distribution. A normal distribution with parameters $\mu_X = 0$ and
$\sigma_X = 1$, called the standard normal distribution, is denoted as N(0, 1). Thus the density
function of a standard normal variable (Z) is given by

$$f_Z(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}, \quad -\infty < z < \infty \tag{11.46}$$

The distribution function of the standard normal variable (Z) is often designated as
$\phi(z)$ so that, with reference to Fig. 11.4,

$$\phi(z_1) = p \quad \text{and} \quad z_1 = \phi^{-1}(p) \tag{11.47}$$

where p is the cumulative probability. The distribution function N(0, 1) [i.e., $\phi(z)$] is
tabulated widely as standard normal tables. For example, Table 11.1 gives the values
of z, f(z), and $\phi(z)$ for positive values of z only. Negative values need not be tabulated
because the density function is symmetric about the mean value (z = 0) and hence

$$f(-z) = f(z) \tag{11.48}$$

$$\phi(-z) = 1 - \phi(z) \tag{11.49}$$

By the same token, the values of z corresponding to p < 0.5 can be obtained as

$$z = -\phi^{-1}(1 - p) \tag{11.50}$$

Notice that any normally distributed variable (X) can be reduced to a standard normal
variable by using the transformation

$$z = \frac{x - \mu_X}{\sigma_X} \tag{11.51}$$

For example, if $P(a \le X \le b)$ is required, we have

$$P(a \le X \le b) = \frac{1}{\sigma_X\sqrt{2\pi}}\int_a^b e^{-\frac{1}{2}[(x - \mu_X)/\sigma_X]^2}\,dx \tag{11.52}$$

By using Eq. (11.51) and $dx = \sigma_X\,dz$, Eq. (11.52) can be rewritten as

$$P(a \le X \le b) = \frac{1}{\sqrt{2\pi}}\int_{(a - \mu_X)/\sigma_X}^{(b - \mu_X)/\sigma_X} e^{-z^2/2}\,dz \tag{11.53}$$

This integral can be recognized to be the area under the standard normal density curve
between $(a - \mu_X)/\sigma_X$ and $(b - \mu_X)/\sigma_X$ and hence

$$P(a \le X \le b) = \phi\!\left(\frac{b - \mu_X}{\sigma_X}\right) - \phi\!\left(\frac{a - \mu_X}{\sigma_X}\right) \tag{11.54}$$
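Where a table is not at hand, $\phi(z)$ and its inverse can be evaluated in software. The sketch below is one possible way to do this in Python, using the error function from the standard library for $\phi(z)$ and `statistics.NormalDist.inv_cdf` for $\phi^{-1}(p)$; the helper names `phi`, `prob_between`, and `z_for` are assumptions introduced here.

```python
import math
from statistics import NormalDist

def phi(z):
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for a normal variable with mean mu and std sigma, Eq. (11.54)."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

def z_for(p):
    """Value z such that phi(z) = p, as in Eq. (11.47)."""
    return NormalDist().inv_cdf(p)

print(round(phi(1.0), 6))                        # compare with Table 11.1 at z = 1.0
print(round(prob_between(-1.0, 1.0, 0.0, 1.0), 6))
print(round(z_for(0.95), 4))
```

The symmetry relations of Eqs. (11.49) and (11.50) can be checked with these helpers, for example by comparing `phi(-z)` with `1 - phi(z)`.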


Table 11.1 Standard Normal Distribution Table

  z       f(z)         \phi(z)
 0.0    0.398942     0.500000
 0.1    0.396952     0.539828
 0.2    0.391043     0.579260
 0.3    0.381388     0.617912
 0.4    0.368270     0.655422
 0.5    0.352065     0.691463
 0.6    0.333225     0.725747
 0.7    0.312254     0.758036
 0.8    0.289692     0.788145
 0.9    0.266085     0.815940
 1.0    0.241971     0.841345
 1.1    0.217852     0.864334
 1.2    0.194186     0.884930
 1.3    0.171369     0.903199
 1.4    0.149727     0.919243
 1.5    0.129518     0.933193
 1.6    0.110921     0.945201
 1.7    0.094049     0.955435
 1.8    0.078950     0.964070
 1.9    0.065616     0.971284
 2.0    0.053991     0.977250
 2.1    0.043984     0.982136
 2.2    0.035475     0.986097
 2.3    0.028327     0.989276
 2.4    0.022395     0.991802
 2.5    0.017528     0.993790
 2.6    0.013583     0.995339
 2.7    0.010421     0.996533
 2.8    0.007915     0.997445
 2.9    0.005952     0.998134
 3.0    0.004432     0.998650
 3.5    0.000873     0.999767
 4.0    0.000134     0.999968
 4.5    0.000016     0.999996
 5.0    0.0000015    0.9999997

Example 11.4 The width of a slot on a duralumin forging is normally distributed.
The specification of the slot width is 0.900 ± 0.005. The parameters $\mu = 0.9$ and
$\sigma = 0.003$ are known from past experience in the production process. What is the percent
of scrap forgings?

SOLUTION If X denotes the width of the slot on the forging, the usable region is
given by

$$0.895 \le x \le 0.905$$


and the amount of scrap is given by

$$\text{scrap} = P(x \le 0.895) + P(x \ge 0.905)$$

In terms of the standardized normal variable,

$$\begin{aligned}
\text{scrap} &= P\!\left(Z \le \frac{-0.9 + 0.895}{0.003}\right) + P\!\left(Z \ge \frac{-0.9 + 0.905}{0.003}\right) \\
&= P(Z \le -1.667) + P(Z \ge +1.667) \\
&= [1 - P(Z \le 1.667)] + [1 - P(Z \le 1.667)] \\
&= 2.0 - 2P(Z \le 1.667) \\
&= 2.0 - 2(0.9525) = 0.095 = 9.5\%
\end{aligned}$$

Joint Normal Density Function. If $X_1, X_2, \ldots, X_n$ follow normal distribution, any
linear function, $Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$, also follows normal distribution with
mean

$$\overline{Y} = a_1 \overline{X}_1 + a_2 \overline{X}_2 + \cdots + a_n \overline{X}_n \tag{11.55}$$

and variance

$$\operatorname{Var}(Y) = a_1^2 \operatorname{Var}(X_1) + a_2^2 \operatorname{Var}(X_2) + \cdots + a_n^2 \operatorname{Var}(X_n) \tag{11.56}$$

if $X_1, X_2, \ldots, X_n$ are independent. In general, the joint normal density function for
n independent random variables is given by

$$f_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = \frac{1}{\sqrt{(2\pi)^n}\,\sigma_1 \sigma_2 \cdots \sigma_n} \exp\!\left[-\frac{1}{2}\sum_{k=1}^{n}\left(\frac{x_k - \overline{X}_k}{\sigma_k}\right)^2\right] = f_{X_1}(x_1)\,f_{X_2}(x_2) \cdots f_{X_n}(x_n) \tag{11.57}$$

where $\sigma_i = \sigma_{X_i}$. If the correlation between the random variables $X_k$ and $X_j$ is not zero,
the joint density function is given by

$$f_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n) = \frac{1}{\sqrt{(2\pi)^n |\mathbf{K}|}} \exp\!\left[-\frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} \{K^{-1}\}_{jk}\,(x_j - \overline{X}_j)(x_k - \overline{X}_k)\right] \tag{11.58}$$

where

$$K_{X_j X_k} = K_{jk} = E[(x_j - \overline{X}_j)(x_k - \overline{X}_k)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x_j - \overline{X}_j)(x_k - \overline{X}_k)\,f_{X_j,X_k}(x_j, x_k)\,dx_j\,dx_k = \text{covariance between } X_j \text{ and } X_k$$

$$\mathbf{K} = \text{correlation matrix} = \begin{bmatrix} K_{11} & K_{12} & \cdots & K_{1n} \\ K_{21} & K_{22} & \cdots & K_{2n} \\ \vdots & \vdots & & \vdots \\ K_{n1} & K_{n2} & \cdots & K_{nn} \end{bmatrix} \tag{11.59}$$

and $\{K^{-1}\}_{jk}$ = jkth element of $\mathbf{K}^{-1}$. It is to be noted that $K_{X_j X_k} = 0$ for $j \ne k$ and
$= \sigma_{X_j}^2$ for $j = k$ in case there is no correlation between $X_j$ and $X_k$.

11.2.9 Central Limit Theorem

If $X_1, X_2, \ldots, X_n$ are n mutually independent random variables with finite mean and
variance (they may follow different distributions), the sum

$$S_n = \sum_{i=1}^{n} X_i \tag{11.60}$$

tends to a normal variable if no single variable contributes significantly to the sum as
n tends to infinity. Because of this theorem, physical quantities that arise as the sum of
many independent effects can often be approximated as normal random variables.
Physically, $S_n$ may represent, for example, the tensile strength of a fiber-reinforced
material, in which case the total tensile strength is given by the sum of the tensile
strengths of individual fibers. In this case the tensile strength of the material may be
represented as a normally distributed random variable.
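The tendency described by the central limit theorem can be observed in a small simulation. The sketch below is illustrative only: the choice of uniform(0, 1) terms, the number of terms, the sample size, and the seed are all assumptions, and the function name `simulate_sums` is introduced here.

```python
import random
import statistics

def simulate_sums(n_terms=12, n_samples=20000, seed=0):
    """Draw n_samples realizations of S_n, the sum of n_terms uniform(0, 1) variables."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n_terms)) for _ in range(n_samples)]

n_terms = 12
sums = simulate_sums(n_terms)

mean_s = statistics.fmean(sums)
std_s = statistics.pstdev(sums)

# Theoretical moments of S_n: each uniform(0, 1) term has mean 1/2 and variance 1/12.
mean_theory = n_terms / 2.0
std_theory = (n_terms / 12.0) ** 0.5

# Fraction of samples within one standard deviation of the mean;
# a normal variable would give about 0.683.
within_one_sigma = sum(abs(s - mean_theory) <= std_theory for s in sums) / len(sums)
print(mean_s, std_s, within_one_sigma)
```

Although each term is far from normal, the fraction of sums within one standard deviation of the mean is expected to be close to the normal value.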


11.3 STOCHASTIC LINEAR PROGRAMMING 


A stochastic linear programming problem can be stated as follows:

$$\text{Minimize } f(\mathbf{X}) = \mathbf{C}^T \mathbf{X} = \sum_{j=1}^{n} c_j x_j \tag{11.61}$$

subject to

$$\mathbf{A}_i^T \mathbf{X} = \sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad i = 1, 2, \ldots, m \tag{11.62}$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, n \tag{11.63}$$

where $c_j$, $a_{ij}$, and $b_i$ are random variables (the decision variables $x_j$ are assumed to
be deterministic for simplicity) with known probability distributions. Several methods
are available for solving the problem stated in Eqs. (11.61) to (11.63). We consider a
method known as the chance-constrained programming technique in this section.

As the name indicates, the chance-constrained programming technique can be used
to solve problems involving chance constraints, that is, constraints having finite
probability of being violated. This technique was originally developed by Charnes and Cooper
[11.5]. In this method the stochastic programming problem is stated as follows:

$$\text{Minimize } f(\mathbf{X}) = \sum_{j=1}^{n} c_j x_j \tag{11.64}$$

subject to

$$P\!\left[\sum_{j=1}^{n} a_{ij} x_j \le b_i\right] \ge p_i, \quad i = 1, 2, \ldots, m \tag{11.65}$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, n \tag{11.66}$$

where $c_j$, $a_{ij}$, and $b_i$ are random variables and $p_i$ are specified probabilities. Notice
that Eqs. (11.65) indicate that the ith constraint,

$$\sum_{j=1}^{n} a_{ij} x_j \le b_i$$

has to be satisfied with a probability of at least $p_i$, where $0 \le p_i \le 1$. For simplicity,
we assume that the design variables $x_j$ are deterministic and $c_j$, $a_{ij}$, and $b_i$ are random
variables. We shall further assume that all the random variables are normally distributed
with known mean and standard deviations.

Since $c_j$ are normally distributed random variables, the objective function $f(\mathbf{X})$
will also be a normally distributed random variable. The mean and variance of f are
given by

$$\overline{f} = \sum_{j=1}^{n} \overline{c}_j x_j \tag{11.67}$$

$$\operatorname{Var}(f) = \mathbf{X}^T \mathbf{V} \mathbf{X} \tag{11.68}$$

where $\overline{c}_j$ is the mean value of $c_j$ and the matrix $\mathbf{V}$ is the covariance matrix of $c_j$
defined as

$$\mathbf{V} = \begin{bmatrix} \operatorname{Var}(c_1) & \operatorname{Cov}(c_1, c_2) & \cdots & \operatorname{Cov}(c_1, c_n) \\ \operatorname{Cov}(c_2, c_1) & \operatorname{Var}(c_2) & \cdots & \operatorname{Cov}(c_2, c_n) \\ \vdots & \vdots & & \vdots \\ \operatorname{Cov}(c_n, c_1) & \operatorname{Cov}(c_n, c_2) & \cdots & \operatorname{Var}(c_n) \end{bmatrix} \tag{11.69}$$

with $\operatorname{Var}(c_j)$ and $\operatorname{Cov}(c_i, c_j)$ denoting the variance of $c_j$ and covariance between $c_i$
and $c_j$, respectively. A new deterministic objective function for minimization can be
formulated as

$$F(\mathbf{X}) = k_1 \overline{f} + k_2 \sqrt{\operatorname{Var}(f)} \tag{11.70}$$

where $k_1$ and $k_2$ are nonnegative constants whose values indicate the relative importance
of $\overline{f}$ and the standard deviation of f for minimization. Thus $k_2 = 0$ indicates that the
expected value of f is to be minimized without caring for the standard deviation of
f. On the other hand, if $k_1 = 0$, it indicates that we are interested in minimizing the
variability of f about its mean value without bothering about what happens to the mean
value of f. Similarly, if $k_1 = k_2 = 1$, it indicates that we are giving equal importance
to the minimization of the mean as well as the standard deviation of f. Notice that the
new objective function stated in Eq. (11.70) is a nonlinear function in $\mathbf{X}$ in view of the
expression for the variance of f.

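The constraints of Eq. (11.65) can be expressed as

$$P[h_i \le 0] \ge p_i, \quad i = 1, 2, \ldots, m \tag{11.71}$$

where $h_i$ is a new random variable defined as

$$h_i = \sum_{j=1}^{n} a_{ij} x_j - b_i = \sum_{k=1}^{n+1} q_{ik} y_k \tag{11.72}$$

where

$$q_{ik} = a_{ik}, \quad k = 1, 2, \ldots, n, \qquad q_{i,n+1} = b_i$$

$$y_k = x_k, \quad k = 1, 2, \ldots, n, \qquad y_{n+1} = -1$$

Notice that the constant $y_{n+1}$ is introduced for convenience. Since $h_i$ is given by a
linear combination of the normally distributed random variables $q_{ik}$, it will also follow
normal distribution. The mean and the variance of $h_i$ are given by

$$\overline{h}_i = \sum_{k=1}^{n+1} \overline{q}_{ik} y_k = \sum_{j=1}^{n} \overline{a}_{ij} x_j - \overline{b}_i \tag{11.73}$$

$$\operatorname{Var}(h_i) = \mathbf{Y}^T \mathbf{V}_i \mathbf{Y} \tag{11.74}$$

where

$$\mathbf{Y} = \{y_1, y_2, \ldots, y_{n+1}\}^T \tag{11.75}$$

$$\mathbf{V}_i = \begin{bmatrix} \operatorname{Var}(q_{i1}) & \operatorname{Cov}(q_{i1}, q_{i2}) & \cdots & \operatorname{Cov}(q_{i1}, q_{i,n+1}) \\ \operatorname{Cov}(q_{i2}, q_{i1}) & \operatorname{Var}(q_{i2}) & \cdots & \operatorname{Cov}(q_{i2}, q_{i,n+1}) \\ \vdots & \vdots & & \vdots \\ \operatorname{Cov}(q_{i,n+1}, q_{i1}) & \operatorname{Cov}(q_{i,n+1}, q_{i2}) & \cdots & \operatorname{Var}(q_{i,n+1}) \end{bmatrix} \tag{11.76}$$

This can be written more explicitly as

$$\begin{aligned}
\operatorname{Var}(h_i) &= \sum_{k=1}^{n+1}\left[y_k^2 \operatorname{Var}(q_{ik}) + 2\sum_{l=k+1}^{n+1} y_k y_l \operatorname{Cov}(q_{ik}, q_{il})\right] \\
&= \sum_{k=1}^{n}\left[y_k^2 \operatorname{Var}(q_{ik}) + 2\sum_{l=k+1}^{n} y_k y_l \operatorname{Cov}(q_{ik}, q_{il})\right] + y_{n+1}^2 \operatorname{Var}(q_{i,n+1}) + 2\sum_{k=1}^{n} y_k y_{n+1} \operatorname{Cov}(q_{ik}, q_{i,n+1}) \\
&= \sum_{k=1}^{n}\left[x_k^2 \operatorname{Var}(a_{ik}) + 2\sum_{l=k+1}^{n} x_k x_l \operatorname{Cov}(a_{ik}, a_{il})\right] + \operatorname{Var}(b_i) - 2\sum_{k=1}^{n} x_k \operatorname{Cov}(a_{ik}, b_i)
\end{aligned} \tag{11.77}$$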
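Thus the constraints in Eqs. (11.71) can be restated as

$$P\!\left[\frac{h_i - \overline{h}_i}{\sqrt{\operatorname{Var}(h_i)}} \le \frac{-\overline{h}_i}{\sqrt{\operatorname{Var}(h_i)}}\right] \ge p_i, \quad i = 1, 2, \ldots, m \tag{11.78}$$

where $(h_i - \overline{h}_i)/\sqrt{\operatorname{Var}(h_i)}$ represents a standard normal variable with a mean value
of zero and a variance of 1.

Thus if $s_i$ denotes the value of the standard normal variable at which

$$\phi(s_i) = p_i \tag{11.79}$$

the constraints of Eq. (11.78) can be stated as

$$\phi\!\left(\frac{-\overline{h}_i}{\sqrt{\operatorname{Var}(h_i)}}\right) \ge \phi(s_i), \quad i = 1, 2, \ldots, m \tag{11.80}$$

These inequalities will be satisfied only if the following deterministic nonlinear
inequalities are satisfied:

$$\frac{-\overline{h}_i}{\sqrt{\operatorname{Var}(h_i)}} \ge s_i, \quad i = 1, 2, \ldots, m$$

or

$$\overline{h}_i + s_i \sqrt{\operatorname{Var}(h_i)} \le 0, \quad i = 1, 2, \ldots, m \tag{11.81}$$

Thus the stochastic linear programming problem of Eqs. (11.64) to (11.66) can be
stated as an equivalent deterministic nonlinear programming problem as

$$\text{Minimize } F(\mathbf{X}) = k_1 \sum_{j=1}^{n} \overline{c}_j x_j + k_2 \sqrt{\mathbf{X}^T \mathbf{V} \mathbf{X}}, \quad k_1 \ge 0, \; k_2 \ge 0,$$

subject to

$$\overline{h}_i + s_i \sqrt{\operatorname{Var}(h_i)} \le 0, \quad i = 1, 2, \ldots, m$$

$$x_j \ge 0, \quad j = 1, 2, \ldots, n \tag{11.82}$$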


Example 11.5 A manufacturing firm produces two machine parts using lathes, milling
machines, and grinding machines. If the machining times required, maximum times
available, and the unit profits are all assumed to be normally distributed random
variables with the following data, find the number of parts to be manufactured per week
to maximize the profit. The constraints have to be satisfied with a probability of at
least 0.99.

                     Machining time required per unit (min)                                          Maximum time available
                     Part I                                    Part II                               per week (min)
Type of machine      Mean                  Std. deviation      Mean                  Std. deviation  Mean                Std. deviation
Lathes               \bar{a}_{11} = 10     \sigma_{a11} = 6    \bar{a}_{12} = 5      \sigma_{a12} = 4    \bar{b}_1 = 2500    \sigma_{b1} = 500
Milling machines     \bar{a}_{21} = 4      \sigma_{a21} = 4    \bar{a}_{22} = 10     \sigma_{a22} = 7    \bar{b}_2 = 2000    \sigma_{b2} = 400
Grinding machines    \bar{a}_{31} = 1      \sigma_{a31} = 2    \bar{a}_{32} = 1.5    \sigma_{a32} = 3    \bar{b}_3 = 450     \sigma_{b3} = 50
Profit per unit      \bar{c}_1 = 50        \sigma_{c1} = 20    \bar{c}_2 = 100       \sigma_{c2} = 50


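SOLUTION By defining new random variables $h_i$ as

$$h_i = \sum_{j=1}^{n} a_{ij} x_j - b_i,$$

we find that $h_i$ are also normally distributed. By assuming that there is no correlation
between the $a_{ij}$'s and $b_i$'s, the means and variances of $h_i$ can be obtained from Eqs. (11.73)
and (11.77) as

$$\overline{h}_1 = \overline{a}_{11} x_1 + \overline{a}_{12} x_2 - \overline{b}_1 = 10 x_1 + 5 x_2 - 2500$$
$$\overline{h}_2 = \overline{a}_{21} x_1 + \overline{a}_{22} x_2 - \overline{b}_2 = 4 x_1 + 10 x_2 - 2000$$
$$\overline{h}_3 = \overline{a}_{31} x_1 + \overline{a}_{32} x_2 - \overline{b}_3 = x_1 + 1.5 x_2 - 450$$

$$\sigma_{h_1}^2 = x_1^2 \sigma_{a_{11}}^2 + x_2^2 \sigma_{a_{12}}^2 + \sigma_{b_1}^2 = 36 x_1^2 + 16 x_2^2 + 250{,}000$$
$$\sigma_{h_2}^2 = x_1^2 \sigma_{a_{21}}^2 + x_2^2 \sigma_{a_{22}}^2 + \sigma_{b_2}^2 = 16 x_1^2 + 49 x_2^2 + 160{,}000$$
$$\sigma_{h_3}^2 = x_1^2 \sigma_{a_{31}}^2 + x_2^2 \sigma_{a_{32}}^2 + \sigma_{b_3}^2 = 4 x_1^2 + 9 x_2^2 + 2500$$

Assuming that the profits are independent random variables, the covariance matrix of
$c_j$ is given by

$$\mathbf{V} = \begin{bmatrix} \operatorname{Var}(c_1) & 0 \\ 0 & \operatorname{Var}(c_2) \end{bmatrix} = \begin{bmatrix} 400 & 0 \\ 0 & 2500 \end{bmatrix}$$

and the variance of the objective function by

$$\operatorname{Var}(f) = \mathbf{X}^T \mathbf{V} \mathbf{X} = 400 x_1^2 + 2500 x_2^2$$

Since the profit is to be maximized, the negative of the expected profit is minimized.
Thus the objective function can be taken as

$$F = -k_1 (50 x_1 + 100 x_2) + k_2 \sqrt{400 x_1^2 + 2500 x_2^2}$$

The constraints can be stated as

$$P[h_i \le 0] \ge p_i = 0.99, \quad i = 1, 2, 3$$

As the value of the standard normal variate ($s_i$) corresponding to the probability 0.99
is 2.33 (obtained from Table 11.1), we can state the equivalent deterministic nonlinear
optimization problem as follows:

$$\text{Minimize } F = -k_1 (50 x_1 + 100 x_2) + k_2 \sqrt{400 x_1^2 + 2500 x_2^2}$$

subject to

$$10 x_1 + 5 x_2 + 2.33\sqrt{36 x_1^2 + 16 x_2^2 + 250{,}000} - 2500 \le 0$$
$$4 x_1 + 10 x_2 + 2.33\sqrt{16 x_1^2 + 49 x_2^2 + 160{,}000} - 2000 \le 0$$
$$x_1 + 1.5 x_2 + 2.33\sqrt{4 x_1^2 + 9 x_2^2 + 2500} - 450 \le 0$$
$$x_1 \ge 0, \quad x_2 \ge 0$$

This problem can be solved by any of the nonlinear programming techniques once the
values of $k_1$ and $k_2$ are specified.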
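The deterministic constraints of Example 11.5 are easy to evaluate for any trial production plan. The sketch below encodes the three constraints in the form $\overline{h}_i + s_i\sqrt{\operatorname{Var}(h_i)} \le 0$ and, purely as an assumed illustration, searches integer plans by brute force for the largest expected profit (that is, weighting only the mean profit). The function names and the search limit are assumptions; this is not one of the nonlinear programming methods referred to in the text.

```python
import math

S = 2.33  # standard normal variate for p = 0.99, as used in Example 11.5

# (mean coefficients, std of coefficients, mean available time, std of available time)
MACHINES = [
    ((10.0, 5.0), (6.0, 4.0), 2500.0, 500.0),   # lathes
    ((4.0, 10.0), (4.0, 7.0), 2000.0, 400.0),   # milling machines
    ((1.0, 1.5), (2.0, 3.0), 450.0, 50.0),      # grinding machines
]

def constraint_values(x1, x2):
    """Values of h_bar_i + S * sqrt(Var(h_i)); the plan is feasible when all are <= 0."""
    values = []
    for (a1, a2), (s1, s2), b, sb in MACHINES:
        h_mean = a1 * x1 + a2 * x2 - b
        h_var = (s1 * x1) ** 2 + (s2 * x2) ** 2 + sb ** 2
        values.append(h_mean + S * math.sqrt(h_var))
    return values

def is_feasible(x1, x2):
    return all(v <= 0.0 for v in constraint_values(x1, x2))

def best_integer_plan(limit=300):
    """Brute-force search over integer plans for the largest expected profit."""
    best = (0.0, 0, 0)
    for x1 in range(limit + 1):
        for x2 in range(limit + 1):
            if is_feasible(x1, x2):
                profit = 50.0 * x1 + 100.0 * x2
                if profit > best[0]:
                    best = (profit, x1, x2)
    return best

profit, x1, x2 = best_integer_plan()
print(profit, x1, x2, constraint_values(x1, x2))
```

A search of this kind is practical only because there are two variables; it is meant as a check on the constraint expressions, not as a general solution method.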


11.4 STOCHASTIC NONLINEAR PROGRAMMING

When some of the parameters involved in the objective function and constraints vary
about their mean values, a general optimization problem has to be formulated as a
stochastic nonlinear programming problem. For the present purpose we assume that
all the random variables are independent and follow normal distribution. A stochastic
nonlinear programming problem can be stated in standard form as

$$\text{Find } \mathbf{X} \text{ which minimizes } f(\mathbf{Y}) \tag{11.83}$$

subject to

$$P[g_j(\mathbf{Y}) \ge 0] \ge p_j, \quad j = 1, 2, \ldots, m \tag{11.84}$$

where $\mathbf{Y}$ is the vector of N random variables $y_1, y_2, \ldots, y_N$ and it includes the decision
variables $x_1, x_2, \ldots, x_n$. The case when $\mathbf{X}$ is deterministic can be obtained as a special
case of the present formulation. Equations (11.84) denote that the probability of
realizing $g_j(\mathbf{Y})$ greater than or equal to zero must be greater than or equal to the specified
probability $p_j$. The problem stated in Eqs. (11.83) and (11.84) can be converted into
an equivalent deterministic nonlinear programming problem by applying the
chance-constrained programming technique as follows.


11.4.1 Objective Function 

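The objective function $f(\mathbf{Y})$ can be expanded about the mean values of $y_i$, $\overline{y}_i$, as

$$f(\mathbf{Y}) = f(\overline{\mathbf{Y}}) + \sum_{i=1}^{N}\left(\left.\frac{\partial f}{\partial y_i}\right|_{\overline{\mathbf{Y}}}\right)(y_i - \overline{y}_i) + \text{higher-order derivative terms} \tag{11.85}$$

If the standard deviations of $y_i$, $\sigma_{y_i}$, are small, $f(\mathbf{Y})$ can be approximated by the first
two terms of Eq. (11.85):

$$f(\mathbf{Y}) \simeq f(\overline{\mathbf{Y}}) + \sum_{i=1}^{N}\left(\left.\frac{\partial f}{\partial y_i}\right|_{\overline{\mathbf{Y}}}\right)(y_i - \overline{y}_i) = \psi(\mathbf{Y}) \tag{11.86}$$

If all $y_i$ (i = 1, 2, ..., N) follow normal distribution, $\psi(\mathbf{Y})$, which is a linear function
of $\mathbf{Y}$, also follows normal distribution. The mean and the variance of $\psi$ are given
by

$$\overline{\psi} = \psi(\overline{\mathbf{Y}}) \tag{11.87}$$

$$\operatorname{Var}(\psi) = \sigma_\psi^2 = \sum_{i=1}^{N}\left(\left.\frac{\partial f}{\partial y_i}\right|_{\overline{\mathbf{Y}}}\right)^2 \sigma_{y_i}^2 \tag{11.88}$$

since all $y_i$ are independent. For the purpose of optimization, a new objective function
$F(\mathbf{Y})$ can be constructed as

$$F(\mathbf{Y}) = k_1 \overline{\psi} + k_2 \sigma_\psi \tag{11.89}$$

where $k_1 \ge 0$ and $k_2 \ge 0$, and their numerical values indicate the relative importance
of $\overline{\psi}$ and $\sigma_\psi$ for minimization. Another way of dealing with the standard deviation of
$\psi$ is to minimize $\overline{\psi}$ subject to the constraint $\sigma_\psi \le k_2 \overline{\psi}$, where $k_2$ is a constant, along
with the other constraints.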


11.4.2 Constraints 


If some parameters are random in nature, the constraints will also be probabilistic and
one would like to have the probability that a given constraint is satisfied to be greater
than a certain value. This is precisely what is stated in Eqs. (11.84). The constraint
inequality (11.84) can be written as

$$\int_0^{\infty} f_{g_j}(g_j)\,dg_j \ge p_j \tag{11.90}$$

where $f_{g_j}(g_j)$ is the probability density function of the random variable $g_j$ (a function
of several random variables is also a random variable) whose range is assumed to be
$-\infty$ to $\infty$. The constraint function $g_j(\mathbf{Y})$ can be expanded around the vector of mean
values of the random variables, $\overline{\mathbf{Y}}$, as

$$g_j(\mathbf{Y}) \simeq g_j(\overline{\mathbf{Y}}) + \sum_{i=1}^{N}\left(\left.\frac{\partial g_j}{\partial y_i}\right|_{\overline{\mathbf{Y}}}\right)(y_i - \overline{y}_i) \tag{11.91}$$


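From this equation, the mean value, $\overline{g}_j$, and the standard deviation, $\sigma_{g_j}$, of $g_j$ can be
obtained as

$$\overline{g}_j = g_j(\overline{\mathbf{Y}}) \tag{11.92}$$

$$\sigma_{g_j} = \left\{\sum_{i=1}^{N}\left(\left.\frac{\partial g_j}{\partial y_i}\right|_{\overline{\mathbf{Y}}}\right)^2 \sigma_{y_i}^2\right\}^{1/2} \tag{11.93}$$

By introducing the new variable

$$\theta = \frac{g_j - \overline{g}_j}{\sigma_{g_j}} \tag{11.94}$$

and noting that

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-t^2/2}\,dt = 1 \tag{11.95}$$

Eq. (11.90) can be expressed as

$$\int_{-(\overline{g}_j/\sigma_{g_j})}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\theta^2/2}\,d\theta \ge \int_{-\phi_j(p_j)}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt \tag{11.96}$$

where $\phi_j(p_j)$ is the value of the standard normal variate corresponding to the
probability $p_j$. Thus

$$-\frac{\overline{g}_j}{\sigma_{g_j}} \le -\phi_j(p_j)$$

or

$$-\overline{g}_j + \sigma_{g_j}\,\phi_j(p_j) \le 0 \tag{11.97}$$

Equation (11.97) can be rewritten as

$$\overline{g}_j - \phi_j(p_j)\left[\sum_{i=1}^{N}\left(\left.\frac{\partial g_j}{\partial y_i}\right|_{\overline{\mathbf{Y}}}\right)^2 \sigma_{y_i}^2\right]^{1/2} \ge 0, \quad j = 1, 2, \ldots, m \tag{11.98}$$

Thus the optimization problem of Eqs. (11.83) and (11.84) can be stated in its equivalent
deterministic form as: minimize $F(\mathbf{Y})$ given by Eq. (11.89) subject to the m constraints
given by Eq. (11.98).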
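The left-hand side of Eq. (11.98) can be evaluated for an arbitrary constraint function along the following lines. This is a sketch under stated assumptions: the variables are independent and normal, the partial derivatives are estimated by central differences instead of analytically, and the function name `deterministic_constraint` and the strength and stress data in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

def deterministic_constraint(g, means, stds, p, h=1e-6):
    """Left-hand side of Eq. (11.98): g_bar - phi(p) * sigma_g.

    g     : constraint function of a list of N independent normal variables
    means : mean values of the variables
    stds  : standard deviations of the variables
    p     : required probability of satisfying g(Y) >= 0
    The chance constraint is regarded as satisfied when the result is >= 0.
    """
    g_mean = g(means)
    var_g = 0.0
    for i, (m, s) in enumerate(zip(means, stds)):
        step = h * max(1.0, abs(m))
        up = list(means)
        dn = list(means)
        up[i] = m + step
        dn[i] = m - step
        dg = (g(up) - g(dn)) / (2.0 * step)   # central-difference partial derivative
        var_g += (dg * s) ** 2
    z = NormalDist().inv_cdf(p)               # standard normal variate for probability p
    return g_mean - z * math.sqrt(var_g)

# Assumed illustration: strength y[0] with (mean, std) = (500, 50) must exceed
# stress y[1] with (mean, std) = (300, 40) with probability 0.95.
margin = deterministic_constraint(lambda y: y[0] - y[1],
                                  [500.0, 300.0], [50.0, 40.0], 0.95)
print(round(margin, 2))
```

Because the example constraint is linear, the linearization of Eq. (11.91) is exact here; for nonlinear constraints the result is only as good as the first-order approximation.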


Example 11.6 Design a uniform column of tubular section shown in Fig. 11.5 to
carry a compressive load P for minimum cost. The column is made up of a material
that has a modulus of elasticity E and density $\rho$. The length of the column is l. The
stress induced in the column should be less than the buckling stress as well as the yield
stress. The mean diameter is restricted to lie between 2.0 and 14.0 cm, and columns
with thickness outside the range 0.2 to 0.8 cm are not available in the market. The
cost of the column includes material costs and construction costs and can be taken as
5W + 2d, where W is the weight and d is the mean diameter of the column. The
constraints have to be satisfied with a probability of at least 0.95.

The following quantities are probabilistic and follow normal distribution with mean
and standard deviations as indicated:

Compressive load = $(\overline{P}, \sigma_P)$ = (2500, 500) kg
Young's modulus = $(\overline{E}, \sigma_E)$ = $(0.85 \times 10^6, 0.085 \times 10^6)$ kg_f/cm^2
Density = $(\overline{\rho}, \sigma_\rho)$ = (0.0025, 0.00025) kg_f/cm^3
Yield stress = $(\overline{f}_y, \sigma_{f_y})$ = (500, 50) kg_f/cm^2
Mean diameter of the section = $(\overline{d}, \sigma_d)$ = $(\overline{d}, 0.01\overline{d})$
Column length = $(\overline{l}, \sigma_l)$ = (250, 2.5) cm

Figure 11.5 Column under compressive load.
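SOLUTION This problem, by neglecting standard deviations of the various quantities,
can be seen to be identical to the one considered in Example 1.1. We will take the
design variables as the mean tubular diameter ($\overline{d}$) and the tube thickness (t):

$$\mathbf{X} = \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} \overline{d} \\ t \end{Bmatrix}$$

Notice that one of the design variables (d) is probabilistic in this case and we assume
that $\overline{d}$ is unknown since $\sigma_d$ is given in terms of $\overline{d}$. By denoting the vector of random
variables as

$$\mathbf{Y} = \{y_1, y_2, y_3, y_4, y_5, y_6\}^T = \{P, E, \rho, f_y, l, d\}^T$$

the objective function can be expressed as $f(\mathbf{Y}) = 5W + 2d = 5\rho l \pi d t + 2d$. Since

$$\overline{\mathbf{Y}} = \{\overline{P}, \overline{E}, \overline{\rho}, \overline{f}_y, \overline{l}, \overline{d}\}^T = \{2500,\; 0.85 \times 10^6,\; 0.0025,\; 500,\; 250,\; \overline{d}\}^T$$

$$f(\overline{\mathbf{Y}}) = 5\overline{\rho}\,\overline{l}\,\pi\overline{d}t + 2\overline{d} = 9.8175\,\overline{d}t + 2\overline{d}$$

$$\left.\frac{\partial f}{\partial y_1}\right|_{\overline{\mathbf{Y}}} = \left.\frac{\partial f}{\partial y_2}\right|_{\overline{\mathbf{Y}}} = \left.\frac{\partial f}{\partial y_4}\right|_{\overline{\mathbf{Y}}} = 0$$

$$\left.\frac{\partial f}{\partial y_3}\right|_{\overline{\mathbf{Y}}} = 5\pi\overline{l}\,\overline{d}t = 3927.0\,\overline{d}t$$

$$\left.\frac{\partial f}{\partial y_5}\right|_{\overline{\mathbf{Y}}} = 5\pi\overline{\rho}\,\overline{d}t = 0.03927\,\overline{d}t$$

$$\left.\frac{\partial f}{\partial y_6}\right|_{\overline{\mathbf{Y}}} = 5\pi\overline{\rho}\,\overline{l}\,t + 2 = 9.8175\,t + 2.0$$

Equations (11.87) and (11.88) give

$$\psi(\overline{\mathbf{Y}}) = 9.8175\,\overline{d}t + 2\overline{d} \tag{E1}$$

$$\sigma_\psi^2 = (3927.0\,\overline{d}t)^2 \sigma_\rho^2 + (0.03927\,\overline{d}t)^2 \sigma_l^2 + (9.8175\,t + 2.0)^2 \sigma_d^2 = 0.9835\,\overline{d}^2 t^2 + 0.0004\,\overline{d}^2 + 0.003927\,\overline{d}^2 t \tag{E2}$$

Thus the new objective function for minimization can be expressed as

$$F(\overline{d}, t) = k_1 \overline{\psi} + k_2 \sigma_\psi = k_1 (9.8175\,\overline{d}t + 2\overline{d}) + k_2 (0.9835\,\overline{d}^2 t^2 + 0.0004\,\overline{d}^2 + 0.003927\,\overline{d}^2 t)^{1/2} \tag{E3}$$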


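where $k_1 \ge 0$ and $k_2 \ge 0$ indicate the relative importances of $\overline{\psi}$ and $\sigma_\psi$ for
minimization. By using the expressions derived in Example 1.1, the constraints can be expressed
as

$$P[g_1(\mathbf{Y}) \le 0] = P\!\left[\frac{P}{\pi d t} - f_y \le 0\right] \ge 0.95 \tag{E4}$$

$$P[g_2(\mathbf{Y}) \le 0] = P\!\left[\frac{P}{\pi d t} - \frac{\pi^2 E}{8 l^2}(d^2 + t^2) \le 0\right] \ge 0.95 \tag{E5}$$

$$P[g_3(\mathbf{Y}) \le 0] = P[-d + 2.0 \le 0] \ge 0.95 \tag{E6}$$

$$P[g_4(\mathbf{Y}) \le 0] = P[d - 14.0 \le 0] \ge 0.95 \tag{E7}$$

$$P[g_5(\mathbf{Y}) \le 0] = P[-t + 0.2 \le 0] \ge 0.95 \tag{E8}$$

$$P[g_6(\mathbf{Y}) \le 0] = P[t - 0.8 \le 0] \ge 0.95 \tag{E9}$$

The mean values of the constraint functions are given by Eq. (11.92) as

$$\overline{g}_1 = \frac{\overline{P}}{\pi\overline{d}t} - \overline{f}_y = \frac{2500}{\pi\overline{d}t} - 500$$

$$\overline{g}_2 = \frac{\overline{P}}{\pi\overline{d}t} - \frac{\pi^2 \overline{E}(\overline{d}^2 + t^2)}{8\overline{l}^2} = \frac{2500}{\pi\overline{d}t} - \frac{\pi^2 (0.85 \times 10^6)(\overline{d}^2 + t^2)}{8(250)^2}$$

$$\overline{g}_3 = -\overline{d} + 2.0$$
$$\overline{g}_4 = \overline{d} - 14.0$$
$$\overline{g}_5 = -t + 0.2$$
$$\overline{g}_6 = t - 0.8$$

The partial derivatives of the constraint functions can be computed as follows:

$$\left.\frac{\partial g_1}{\partial y_2}\right|_{\overline{\mathbf{Y}}} = \left.\frac{\partial g_1}{\partial y_3}\right|_{\overline{\mathbf{Y}}} = \left.\frac{\partial g_1}{\partial y_5}\right|_{\overline{\mathbf{Y}}} = 0$$

$$\left.\frac{\partial g_1}{\partial y_1}\right|_{\overline{\mathbf{Y}}} = \frac{1}{\pi\overline{d}t}, \qquad \left.\frac{\partial g_1}{\partial y_4}\right|_{\overline{\mathbf{Y}}} = -1, \qquad \left.\frac{\partial g_1}{\partial y_6}\right|_{\overline{\mathbf{Y}}} = -\frac{\overline{P}}{\pi\overline{d}^2 t} = -\frac{2500}{\pi\overline{d}^2 t}$$

$$\left.\frac{\partial g_2}{\partial y_3}\right|_{\overline{\mathbf{Y}}} = \left.\frac{\partial g_2}{\partial y_4}\right|_{\overline{\mathbf{Y}}} = 0$$

$$\left.\frac{\partial g_2}{\partial y_1}\right|_{\overline{\mathbf{Y}}} = \frac{1}{\pi\overline{d}t}$$

$$\left.\frac{\partial g_2}{\partial y_2}\right|_{\overline{\mathbf{Y}}} = -\frac{\pi^2 (\overline{d}^2 + t^2)}{8\overline{l}^2} = -\frac{\pi^2 (\overline{d}^2 + t^2)}{500{,}000}$$

$$\left.\frac{\partial g_2}{\partial y_5}\right|_{\overline{\mathbf{Y}}} = \frac{\pi^2 \overline{E}(\overline{d}^2 + t^2)}{4\overline{l}^3} = 0.0136\,\pi^2 (\overline{d}^2 + t^2)$$

$$\left.\frac{\partial g_2}{\partial y_6}\right|_{\overline{\mathbf{Y}}} = -\frac{\overline{P}}{\pi\overline{d}^2 t} - \frac{\pi^2 \overline{E}(2\overline{d})}{8\overline{l}^2} = -\frac{2500}{\pi\overline{d}^2 t} - \pi^2 (3.4)\overline{d}$$

$$\left.\frac{\partial g_3}{\partial y_i}\right|_{\overline{\mathbf{Y}}} = 0 \text{ for } i = 1 \text{ to } 5, \qquad \left.\frac{\partial g_3}{\partial y_6}\right|_{\overline{\mathbf{Y}}} = -1.0$$

$$\left.\frac{\partial g_4}{\partial y_i}\right|_{\overline{\mathbf{Y}}} = 0 \text{ for } i = 1 \text{ to } 5, \qquad \left.\frac{\partial g_4}{\partial y_6}\right|_{\overline{\mathbf{Y}}} = 1.0$$

$$\left.\frac{\partial g_5}{\partial y_i}\right|_{\overline{\mathbf{Y}}} = \left.\frac{\partial g_6}{\partial y_i}\right|_{\overline{\mathbf{Y}}} = 0 \text{ for } i = 1 \text{ to } 6$$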

Since the value of the standard normal variate corresponding to the probability $p_j = 0.95$ is 1.645 (obtained from Table 11.1), the constraints in Eq. (11.98) can be expressed as follows.

For $j = 1$:†

$$\frac{2500}{\pi \bar{d} t} - 500 - 1.645\left[\frac{\sigma_P^2}{\pi^2 \bar{d}^2 t^2} + \sigma_{f_y}^2 + \frac{(2500)^2}{\pi^2 \bar{d}^4 t^2}\sigma_d^2\right]^{1/2} \le 0$$

$$\frac{795}{\bar{d} t} - 500 - 1.645\left(\frac{25{,}320}{\bar{d}^2 t^2} + 2500 + \frac{63.3}{\bar{d}^2 t^2}\right)^{1/2} \le 0 \tag{E$_{10}$}$$

For $j = 2$:

$$\frac{2500}{\pi \bar{d} t} - 16.78(\bar{d}^2 + t^2) - 1.645\left[\frac{\sigma_P^2}{\pi^2 \bar{d}^2 t^2} + \frac{\pi^4(\bar{d}^2 + t^2)^2 \sigma_E^2}{25 \times 10^{10}} + (0.0136\pi^2)^2(\bar{d}^2 + t^2)^2 \sigma_l^2 + \left(\frac{2500}{\pi \bar{d}^2 t} + 3.4\pi^2 \bar{d}\right)^2 \sigma_d^2\right]^{1/2} \le 0$$

$$\frac{795}{\bar{d} t} - 16.78(\bar{d}^2 + t^2) - 1.645\left[\frac{25{,}320}{\bar{d}^2 t^2} + 2.82(\bar{d}^2 + t^2)^2 + 0.113(\bar{d}^2 + t^2)^2 + \frac{63.20}{\bar{d}^2 t^2} + 0.1126\,\bar{d}^4 + \frac{5.34\,\bar{d}}{t}\right]^{1/2} \le 0 \tag{E$_{11}$}$$

For $j = 3$:

$$-\bar{d} + 2.0 - 1.645\left[(10^{-4})\bar{d}^2\right]^{1/2} \le 0$$

$$-1.01645\,\bar{d} + 2.0 \le 0 \tag{E$_{12}$}$$

For $j = 4$:

$$\bar{d} - 14.0 - 1.645\left[(10^{-4})\bar{d}^2\right]^{1/2} \le 0$$

$$0.98355\,\bar{d} - 14.0 \le 0 \tag{E$_{13}$}$$

For $j = 5$:

$$-t + 0.2 \le 0 \tag{E$_{14}$}$$

For $j = 6$:

$$t - 0.8 \le 0 \tag{E$_{15}$}$$

†The inequality sign is different from that of Eq. (11.98) due to the fact that the constraints are stated as $P[g_j(\mathbf{Y}) \le 0] \ge p_j$.
Thus the equivalent deterministic optimization problem can be stated as follows: Minimize $F(\bar{d}, t)$ given by Eq. (E$_3$) subject to the constraints given by Eqs. (E$_{10}$) to (E$_{15}$).
The solution of the problem can be found by applying any of the standard nonlinear
programming techniques discussed in Chapter 7. In the present case, since the number
of design variables is only two, a graphical method can also be used to find the solution.
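
As a numerical cross-check on the variance term of the yield constraint, the following Python sketch evaluates the first-order standard deviation of $g_1$ by central finite differences and compares it with the bracketed closed form of Eq. (E10), using unrounded coefficients. The function names, the relative step size, and the trial point $(\bar{d}, t) = (5, 0.5)$ are illustrative assumptions and are not part of the example; the deterministic left-hand side follows the sign convention printed in Eq. (E10).

```python
import math

# Means and standard deviations of the random quantities in the column example.
P_MEAN, P_STD = 2500.0, 500.0      # compressive load (kg)
FY_MEAN, FY_STD = 500.0, 50.0      # yield stress (kg/cm^2)
D_COV = 0.01                       # sigma_d = 0.01 * mean diameter


def g1(P, fy, d, t):
    """Yield constraint function g1 = P/(pi d t) - fy."""
    return P / (math.pi * d * t) - fy


def sigma_g1_numeric(d_mean, t, rel_step=1e-6):
    """First-order standard deviation of g1 using central differences."""
    means = [P_MEAN, FY_MEAN, d_mean]
    stds = [P_STD, FY_STD, D_COV * d_mean]
    var = 0.0
    for i, (mean_i, std_i) in enumerate(zip(means, stds)):
        step = rel_step * abs(mean_i)
        upper = list(means)
        lower = list(means)
        upper[i] = mean_i + step
        lower[i] = mean_i - step
        slope = (g1(upper[0], upper[1], upper[2], t)
                 - g1(lower[0], lower[1], lower[2], t)) / (2.0 * step)
        var += (slope * std_i) ** 2
    return math.sqrt(var)


def sigma_g1_closed_form(d_mean, t):
    """Square root of the bracketed term of Eq. (E10), unrounded coefficients."""
    dt2 = (d_mean * t) ** 2
    return math.sqrt(P_STD ** 2 / (math.pi ** 2 * dt2)
                     + FY_STD ** 2
                     + (P_MEAN * D_COV) ** 2 / (math.pi ** 2 * dt2))


def e10_lhs(d_mean, t, z=1.645):
    """Left-hand side of Eq. (E10) as printed: mean of g1 minus z * sigma."""
    return g1(P_MEAN, FY_MEAN, d_mean, t) - z * sigma_g1_closed_form(d_mean, t)
```

Calling `sigma_g1_numeric(5.0, 0.5)` and `sigma_g1_closed_form(5.0, 0.5)` should give the same value to several digits, which is a quick way to catch a slip in a hand-computed partial derivative before the constraint is passed to an optimizer.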


11.5 STOCHASTIC GEOMETRIC PROGRAMMING 

The deterministic geometric programming problem has been considered in Chapter 8. If 
the constants involved in the posynomials are random variables, the chance-constrained 
programming methods discussed in Sections 11.3 and 11.4 can be applied to this 
problem. The probabilistic geometric programming problem can be stated as follows: 

Find $\mathbf{X} = \{x_1\ x_2\ \cdots\ x_n\}^T$ which minimizes $f(\mathbf{Y})$

subject to

$$P[g_j(\mathbf{Y}) \ge 0] \ge p_j, \qquad j = 1, 2, \ldots, m \tag{11.99}$$

where $\mathbf{Y} = \{y_1, y_2, \ldots, y_N\}^T$ is the vector of $N$ random variables (may include the
variables $x_1, x_2, \ldots, x_n$), and $f(\mathbf{Y})$ and $g_j(\mathbf{Y})$, $j = 1, 2, \ldots, m$, are posynomials. By
expanding the objective function about the mean values of the random variables $y_i$,
$\bar{y}_i$, and retaining only the first two terms, we can express the mean and variance of
$f(\mathbf{Y})$ as in Eqs. (11.87) and (11.88). Thus the new objective function, $F(\mathbf{Y})$, can be
expressed as in Eq. (11.89):

$$F(\mathbf{Y}) = k_1 \bar{f} + k_2 \sigma_f \tag{11.100}$$

The probabilistic constraints of Eq. (11.99) can be converted into deterministic form
as in Section 11.4:

$$\bar{g}_j - \phi_j(p_j)\left[\sum_{i=1}^{N}\left(\left.\frac{\partial g_j}{\partial y_i}\right|_{\bar{\mathbf{Y}}}\right)^2 \sigma_{y_i}^2\right]^{1/2} \ge 0, \qquad j = 1, 2, \ldots, m \tag{11.101}$$


Thus the optimization problem of Eq. (11.99) can be stated equivalently as follows: 
Find Y which minimizes F( Y) given by Eq. (11.100) subject to the constraints of 
Eq. (11.101). The procedure is illustrated through the following example. 


Example 11.7 Design a helical spring for minimum weight subject to a constraint on
the shear stress ($\tau$) induced in the spring under a compressive load $P$.

SOLUTION By selecting the coil diameter ($D$) and wire diameter ($d$) of the spring
as design variables, we have $x_1 = D$ and $x_2 = d$. The objective function can be stated
in deterministic form as [11.14, 11.15]:

$$f(\mathbf{X}) = \frac{\pi^2 d^2 D}{4}(N_c + Q)\rho \tag{E$_1$}$$
where $N_c$ is the number of active turns, $Q$ the number of inactive turns, and $\rho$ the
weight density. Noting that the deflection of the spring ($\delta$) is given by

$$\delta = \frac{8PC^3 N_c}{Gd} \tag{E$_2$}$$

where $P$ is the load, $C = D/d$, and $G$ is the shear modulus. By substituting the
expression of $N_c$ given by Eq. (E$_2$) into Eq. (E$_1$), the objective function can be expressed
as

$$f(\mathbf{X}) = \frac{\pi^2 \rho G \delta}{32P}\frac{d^6}{D^2} + \frac{\pi^2 \rho Q}{4} d^2 D \tag{E$_3$}$$

The yield constraint can be expressed, in deterministic form, as

$$\tau = \frac{8KPC}{\pi d^2} \le \tau_{\max} \tag{E$_4$}$$


where $\tau_{\max}$ is the maximum permissible value of shear stress and $K$ the shear stress
concentration factor given by (for $2 \le C \le 12$):

$$K = \frac{2}{C^{0.25}} \tag{E$_5$}$$

Using Eq. (E$_5$), the constraint of Eq. (E$_4$) can be rewritten as

$$\frac{16P}{\pi \tau_{\max}}\frac{D^{0.75}}{d^{2.75}} \le 1 \tag{E$_6$}$$


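By considering the design variables to be normally distributed with $(\bar{d}, \sigma_d) = \bar{d}(1, 0.05)$
and $(\bar{D}, \sigma_D) = \bar{D}(1, 0.05)$, $k_1 = 1$ and $k_2 = 0$ in Eq. (11.100) and using $p_j = 0.95$,
the problem [Eqs. (11.100) and (11.101)] can be stated as follows:

$$\text{Minimize } F(\mathbf{Y}) = \frac{0.041\,\pi^2 \rho \delta G}{P}\frac{\bar{d}^6}{\bar{D}^2} + 0.278\,\pi^2 \rho Q\,\bar{d}^2 \bar{D} \tag{E$_7$}$$

subject to

$$\frac{12.24\,P}{\pi \tau_{\max}}\frac{\bar{D}^{0.75}}{\bar{d}^{2.75}} \le 1 \tag{E$_8$}$$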


The data are assumed as $P = 510$ N, $\rho = 78{,}000$ N/m$^3$, $\delta = 0.02$ m, $\tau_{\max} = 0.306 \times 10^9$ Pa, and $Q = 2$. The degree of difficulty of the problem can be seen to be zero and
the normality and orthogonality conditions yield

$$\delta_1 + \delta_2 = 1$$

$$6\delta_1 + 2\delta_2 - 2.75\,\delta_3 = 0 \tag{E$_9$}$$

$$-2\delta_1 + \delta_2 + 0.75\,\delta_3 = 0$$

The solution of Eqs. (E$_9$) gives $\delta_1 = 0.81$, $\delta_2 = 0.19$, and $\delta_3 = 1.9$, which corresponds
to $\bar{d} = 0.0053$ m, $\bar{D} = 0.0358$ m, and $f_{\min} = 2.266$ N.
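
Because the degree of difficulty is zero, the dual variables follow from a 3 x 3 linear system alone. The sketch below solves the normality and orthogonality conditions of Eqs. (E9) in exact rational arithmetic; the helper `solve_linear` is a hypothetical name introduced here for illustration, and the rows are transcribed from the exponents of $\bar{d}$ and $\bar{D}$ in Eqs. (E7) and (E8).

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the first row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Rows: normality, orthogonality for d, orthogonality for D.
A = [
    [1, 1, 0],
    ["6", "2", "-2.75"],
    ["-2", "1", "0.75"],
]
rhs = [1, 0, 0]
delta1, delta2, delta3 = solve_linear(A, rhs)
print(float(delta1), float(delta2), float(delta3))
```

The exact solution is $\delta_1 = 17/21$, $\delta_2 = 4/21$, and $\delta_3 = 40/21$, which rounds to the values 0.81, 0.19, and 1.9 quoted above.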
REVIEW QUESTIONS

11.1 Define the following terms: 

(a) Mean 

(b) Variance 

(c) Standard deviation 

(d) Probability 

(e) Independent events 

(f) Joint density function 
(g) Covariance

(h) Central limit theorem 

(i) Chance constrained programming 


11.2 Match the following terms and descriptions: 


(a) Marginal density function 

(b) Bivariate distribution 

(c) Normal distribution 

(d) Discrete distribution 

(e) Continuous distribution 


Describes sum of several random variables 
Described by probability density function 
Describes one random variable 
Describes two random variables 
Described by probability mass function 


11.3 Answer true or false: 

(a) The uniform distribution can be used to describe only continuous random variables. 

(b) The area under the probability density function can have any positive value. 

(c) The standard normal variate has zero mean and unit variance. 

(d) The magnitude of the correlation coefficient is bounded by one. 

(e) Chance constrained programming method can be used to solve only stochastic LP 
problems. 

(f) Chance constrained programming permits violation of constraints to some extent. 

(g) Chance constrained programming assumes the random variables to be normally
distributed.

(h) The design variables need not be random in a stochastic programming problem. 

(i) Chance constrained programming always gives rise to a two-part objective function. 

(j) Chance constrained programming converts a stochastic LP problem into a
deterministic LP problem.

(k) Chance constrained programming converts a stochastic geometric programming 
problem into a deterministic geometric programming problem. 

(l) The introduction of random variables increases the number of state variables in 
stochastic dynamic programming. 

11.4 Explain the notation $N(\mu, \sigma)$.

11.5 What is a random variable?

11.6 Give two examples of random design parameters. 

11.7 What is the difference between probability density and probability distribution functions? 

11.8 What is the difference between discrete and continuous random variables? 

11.9 How does correlation coefficient relate two random variables? 

11.10 Identify possible random variables in a LP problem. 

11.11 How do you find the mean and standard deviation of a sum of several random variables? 


PROBLEMS 

11.1 A contractor plans to use four tractors to work on a project in a remote area. The 
probability of a tractor functioning for a year without a break-down is known to be 
80 %. If X denotes the number of tractors operating at the end of a year, determine the 
probability mass and distribution functions of X. 

11.2 The absolute value of the velocity of a molecule in a perfect gas ($V$) obeys the Maxwell
distribution

$$f_V(v) = \frac{4h^3}{\sqrt{\pi}}\,v^2 e^{-h^2 v^2}, \qquad v \ge 0$$

where $h^2 = (m/2kT)$ is a constant ($m$ is the mass of the molecule, $k$ is Boltzmann's
constant, and $T$ is the absolute temperature). Find the mean and the standard deviation
of the velocity of a molecule.

11.3 Find the expected value and the standard deviation of the number of tractors operating 
at the end of one year in Problem 11.1. 

11.4 Mass-produced items always show random variation in their dimensions due to small
unpredictable and uncontrollable disturbing influences. Suppose that the diameter, $X$, of
the bolts manufactured in a production shop follows the distribution

$$f_X(x) = \begin{cases} a(x - 0.9)(1.1 - x) & \text{for } 0.9 \le x \le 1.1 \\ 0 & \text{elsewhere} \end{cases}$$

Find the values of $a$, $\mu_X$, and $\sigma_X^2$.

11.5 (a) The voltage $V$ across a constant resistance $R$ is known to fluctuate between 0 and
2 volts. If $V$ follows uniform distribution, what is the distribution of the power
expended in the resistance?

(b) Find the distribution of the instantaneous voltage ($V$) given by $V = A\cos(\omega t + \phi)$,
where $A$ is a constant, $\omega$ the frequency, $t$ the time, and $\phi$ the random phase angle
uniformly distributed from 0 to $2\pi$ radians.


11.6 The hydraulic head loss ($H$) in a pipe due to friction is given by the Darcy-Weisbach
equation,

$$H = f\,\frac{L}{2gD}\,V^2$$

where $f$ is the friction factor, $L$ the length of pipe, $V$ the velocity of flow in pipe, $g$ the
acceleration due to gravity, and $D$ the diameter of the pipe. If $V$ follows exponential
distribution,

$$f_V(v) = \begin{cases} \dfrac{1}{V_0}\,e^{-(v/V_0)} & \text{for } v \ge 0 \\ 0 & \text{for } v < 0 \end{cases}$$

where $V_0$ is the mean velocity, derive the density function for the head loss $H$.

11.7 The joint density function of two random variables $X$ and $Y$ is given by

$$f_{X,Y}(x, y) = \begin{cases} 3x^2 y + 3y^2 x & \text{for } 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

Find the marginal density functions of $X$ and $Y$.

11.8 Steel rods, manufactured with a nominal diameter of 3 cm, are considered acceptable
if the diameter falls within the limits of 2.99 and 3.01 cm. It is observed that about 5%
are rejected oversize and 5% are rejected undersize. Assuming that the diameters
are normally distributed, find the standard deviation of the distribution. Compute the
proportion of rejects if the permissible limits are changed to 2.985 and 3.015 cm.

11.9 Determine whether the random variables $X$ and $Y$ are dependent or independent when
their joint density function is given by

$$f_{X,Y}(x, y) = \begin{cases} 4xy & \text{for } 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

11.10 Determine whether the random variables $X$ and $Y$ are dependent or independent when
their joint density function is given by

$$f_{X,Y}(x, y) = \begin{cases} \dfrac{1}{4\pi^2}\,[1 - \sin(x + y)] & \text{for } -\pi \le x \le \pi,\ -\pi \le y \le \pi \\ 0 & \text{elsewhere} \end{cases}$$

11.11 The stress level at which steel yields ($X$) has been found to follow normal distribution.
For a particular batch of steel, the mean and standard deviation of $X$ are found to be
4000 and 300 kg$_f$/cm$^2$, respectively. Find

(a) The probability that a steel bar taken from this batch will have a yield stress between
3000 and 5000 kg$_f$/cm$^2$

(b) The probability that the yield stress will exceed 4500 kg$_f$/cm$^2$

(c) The value of $X$ at which the distribution function has a value of 0.10

11.12 An automobile body is assembled using a large number of spot welds. The number of
defective welds ($X$) closely follows the distribution

$$P(X = d) = \frac{e^{-2}\,2^d}{d!}, \qquad d = 0, 1, 2, \ldots$$

Find the probability that the number of defective welds is less than or equal to 2.
11.13 The range ($R$) of a projectile is given by

$$R = \frac{V_0^2}{g}\sin 2\phi$$

where $V_0$ is the initial velocity of the projectile, $g$ the acceleration due to gravity, and $\phi$
the angle from the horizontal as shown in Fig. 11.6. If the mean and standard deviations
of $V_0$ and $\phi$ are given by $\bar{V}_0 = 100$ ft/s, $\sigma_{V_0} = 10$ ft/s, $\bar{\phi} = 30°$, and $\sigma_\phi = 3°$, find
the first-order mean and standard deviation of the range $R$, assuming that $V_0$ and $\phi$
are statistically independent. Evaluate also the second-order mean range. Assume that
$g = 32.2$ ft/s$^2$.

11.14 Maximize $f = 4x_1 + 2x_2 + 3x_3 + c_4 x_4$

subject to

$$x_1 + x_3 + x_4 \le 24$$

$$3x_1 + x_2 + 2x_3 + 4x_4 \le 48$$

$$2x_1 + 2x_2 + 3x_3 + 2x_4 \le 36$$

$$x_i \ge 0, \qquad i = 1 \text{ to } 4$$

where $c_4$ is a discrete random variable that can take values of 4, 5, 6, or 7 with
probabilities of 0.1, 0.2, 0.3, and 0.4, respectively. Using the simplex method, find the solution
that maximizes the expected value of $f$.

11.15 Find the solution of Problem 11.14 if the objective is to maximize the variance of /. 

11.16 A manufacturing firm can produce 1, 2, or 3 units of a product in a month, but the 
demand is uncertain. The demand is a discrete random variable that can take a value of 
1 , 2, or 3 with probabilities 0.2, 0.2, and 0.6, respectively. If the unit cost of production 
is $400, unit revenue is $1000, and unit cost of unfulfilled demand is $0, determine the 
output that maximizes the expected total profit. 

Figure 11.6 Range of a projectile.

11.17 A factory manufactures products A, B, and C. Each of these products is processed
through three different production stages. The times required to manufacture 1 unit of
each of the three products at different stages and the daily capacity of the stages are
probabilistic with means and standard deviations as indicated below.

              Time per unit (min) for product:
              A                B                C             Stage capacity (min/day)
Stage    Mean  Std. dev.   Mean  Std. dev.   Mean  Std. dev.   Mean   Std. dev.
  1        4      1          8      3          4      4        1720     172
  2       12      2          0      0          8      2        1840     276
  3        4      2         16      4          0      0        1680     336

The profit per unit is also a random variable with the following data:

              Profit ($)
Product    Mean   Standard deviation
   A         6          2
   B         4          1
   C        10          3

Assuming that all amounts produced are absorbed by the market, determine the daily 
number of units to be manufactured of each product for the following cases. 

(a) The objective is to maximize the expected profit. 

(b) The objective is to maximize the standard deviation of the profit. 

(c) The objective is to maximize the sum of expected profit and the standard deviation 
of the profit. 

Assume that all the random variables follow normal distribution and the constraints have 
to be satisfied with a probability of 0.95. 

11.18 In a belt-and-pulley drive, the belt embraces the shorter pulley 165° and runs over it
at a mean speed of 1700 m/min with a standard deviation of 51 m/min. The density of
the belt has a mean value of 1 g/cm$^3$ and a standard deviation of 0.05 g/cm$^3$. The mean
and standard deviations of the permissible stress in the belt are 25 and 2.5 kg$_f$/cm$^2$,
respectively. The coefficient of friction ($\mu$) between the belt and the pulley is given
by $\bar{\mu} = 0.25$ and $\sigma_\mu = 0.05$. Assuming a coefficient of variation of 0.02 for the belt
dimensions, find the width and thickness of the belt to maximize the mean horsepower
transmitted. The minimum permissible values for the width and the thickness of the belt
are 10.0 and 0.5 cm, respectively. Assume that all the random variables follow normal
distribution and the constraints have to be satisfied with a minimum probability of 0.95.
Hint: Horsepower transmitted $= (T_1 - T_2)v/75$, where $T_1$ and $T_2$ are the tensions on
the tight side and slack sides of the belt in kg$_f$ and $v$ is the linear velocity of the belt
in m/s:

$$T_1 = T_{\max} - T_c = T_{\max} - \frac{wv^2}{g} \qquad \text{and} \qquad \frac{T_1}{T_2} = e^{\mu\theta}$$

where $T_{\max}$ is the maximum permissible tension, $T_c$ the centrifugal tension, $w$ the weight
of the belt per meter length, $g$ the acceleration due to gravity in m/s$^2$, and $\theta$ the angle of
contact between the belt and the pulley.
11.19 An article is to be restocked every three months in a year. The quarterly demand $U$ is
random and its law of probability in any of the quarters is as given below:

   u      Probability mass function, P_U(u)
   0        0.2
   1        0.3
   2        0.4
   3        0.1
  >3        0.0

The cost of stocking an article for a unit of time is 4, and when the stock is exhausted,
there is a scarcity charge of 12. The orders that are not satisfied are lost; in other words,
they are not carried forward to the next period. Further, the stock cannot exceed three articles,
owing to the restrictions on space. Find the annual policy of restocking the article so as
to minimize the expected value of the sum of the cost of stocking and of the scarcity
charge.

11.20 A close-coiled helical spring, made up of a circular wire of diameter $d$, is to be designed
to carry a compressive load $P$. The permissible shear stress is $\sigma_{\max}$ and the permissible
deflection is $\delta_{\max}$. The number of active turns of the spring is $n$ and the solid height of
the spring has to be greater than $h$. Formulate the problem of minimizing the volume
of the material so as to satisfy the constraints with a minimum probability of $p$. Take
the mean diameter of the coils ($D$) and the diameter of the wire ($d$) as design variables.
Assume $d$, $D$, $P$, $\sigma_{\max}$, $\delta_{\max}$, $h$, and the shear modulus of the material, $G$, to be normally
distributed random variables. The coefficient of variation of $d$ and $D$ is $k$. The maximum
shear stress, $\sigma$, induced in the spring is given by

$$\sigma = \frac{8PDK}{\pi d^3}$$

where $K$ is the Wahl's stress factor defined by

$$K = \frac{4D - d}{4(D - d)} + \frac{0.615\,d}{D}$$

and the deflection ($\delta$) by

$$\delta = \frac{8PD^3 n}{Gd^4}$$

Formulate the optimization problem for the following data:

$G = N(840{,}000,\ 84{,}000)$ kg$_f$/cm$^2$, $\delta_{\max} = N(2,\ 0.1)$ cm,
$\sigma_{\max} = N(3000,\ 150)$ kg$_f$/cm$^2$,
$P = N(12,\ 3)$ kg$_f$, $n = 8$, $h = N(2.0,\ 0.4)$ cm, $k = 0.05$,
$p = 0.99$

11.21 Solve Problem 11.20 using a graphical technique.
Optimal Control and Optimality Criteria Methods


12.1 INTRODUCTION 


In this chapter we give a brief introduction to the following techniques of optimization: 

1. Calculus of variations 

2. Optimal control theory 

3. Optimality criteria methods 


If an optimization problem involves the minimization (or maximization) of a functional
subject to constraints of the same type, the decision variable will not be a number
but a function. The calculus of variations can be used to solve this type of
optimization problem. An optimization problem that is closely related to the calculus of
variations problem is the optimal control problem. An optimal control problem involves
two types of variables: the control and state variables, which are related to each other
by a set of differential equations. Optimal control theory can be used for solving such
problems. In some optimization problems, especially those related to structural design,
the necessary conditions of optimality, for specialized design conditions, are used to
develop efficient iterative techniques to find the optimum solution. Such techniques are
known as optimality criteria methods.


12.2 CALCULUS OF VARIATIONS 
12.2.1 Introduction 

The calculus of variations is concerned with the determination of extrema (maxima and
minima) or stationary values of functionals. A functional can be defined as a function of
several other functions. Hence the calculus of variations can be used to solve trajectory
optimization problems.† The subject of calculus of variations is almost as old as the
calculus itself. The foundations of this subject were laid down by the Bernoulli brothers, and
later important contributions were made by Euler, Lagrange, Weierstrass, Hamilton, and
Bolza. The calculus of variations is a powerful method for the solution of problems in
several fields, such as statics and dynamics of rigid bodies, general elasticity, vibrations,
optics, and optimization of orbits and controls. We shall see some of the fundamental
concepts of calculus of variations in this section.

†See Section 1.5 for the definition of a trajectory optimization problem.
12.2.2 Problem of Calculus of Variations

A simple problem in the theory of the calculus of variations with no constraints can
be stated as follows:

Find a function $u(x)$ that minimizes the functional (integral)

$$A = \int_{x_1}^{x_2} F(x, u, u', u'')\,dx \tag{12.1}$$

where $A$ and $F$ can be called functionals (functions of other functions). Here $x$ is the
independent variable,

$$u = u(x), \qquad u' = \frac{du(x)}{dx}, \qquad \text{and} \qquad u'' = \frac{d^2 u(x)}{dx^2}$$

In mechanics, the functional usually possesses a clear physical meaning. For example,
in the mechanics of deformable solids, the potential energy ($\pi$) plays the role of the
functional ($\pi$ is a function of the displacement components $u$, $v$, and $w$, which, in turn,
are functions of the coordinates $x$, $y$, and $z$).

The integral in Eq. (12.1) is defined in the region or domain $[x_1, x_2]$. Let the values
of $u$ be prescribed on the boundaries as $u(x_1) = u_1$ and $u(x_2) = u_2$. These are called
the boundary conditions of the problem. One of the procedures that can be used to
solve the problem in Eq. (12.1) will be as follows:

1. Select a series of trial or tentative solutions u(x) for the given problem and 
express the functional A in terms of each of the tentative solutions. 

2. Compare the values of A given by the different tentative solutions. 

3. Find the correct solution to the problem as that particular tentative solution 
which makes the functional A assume an extreme or stationary value. 

The mathematical procedure used to select the correct solution from a number of 
tentative solutions is called the calculus of variations. 
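
The three steps can be sketched numerically on a deliberately simple functional. The example below is an illustrative assumption, not taken from the text: it uses $A = \int_0^1 (u')^2\,dx$ with $u(0) = 0$ and $u(1) = 1$, and a one-parameter family of trial functions $u(x) = x + a\,x(1 - x)$, for which $A = 1 + a^2/3$ in closed form.

```python
def functional_A(a, n=2000):
    """Midpoint-rule value of A = integral_0^1 (u')^2 dx for u = x + a*x*(1 - x)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        du = 1.0 + a * (1.0 - 2.0 * x)   # u'(x) for the trial function
        total += du * du * h
    return total


# Step 1: a family of tentative solutions, all satisfying u(0) = 0 and u(1) = 1.
trial_parameters = [k / 10 for k in range(-10, 11)]
# Step 2: compare the values of A given by the tentative solutions.
values = {a: functional_A(a) for a in trial_parameters}
# Step 3: keep the tentative solution that makes A stationary (here, smallest).
best_a = min(values, key=values.get)
print(best_a, values[best_a])
```

The scan selects $a = 0$, that is, the straight line $u = x$, which is also what the Euler-Lagrange equation $u'' = 0$ gives for this functional. A brute-force comparison of this kind only searches the chosen family; the calculus of variations replaces it with a condition that holds for arbitrary variations.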


Stationary Values of Functionals. Any tentative solution $\bar{u}(x)$ in the neighborhood
of the exact solution $u(x)$ may be represented as (Fig. 12.1)

$$\underbrace{\bar{u}(x)}_{\text{tentative solution}} = \underbrace{u(x)}_{\text{exact solution}} + \underbrace{\delta u(x)}_{\text{variation of } u} \tag{12.2}$$

The variation in $u$ (i.e., $\delta u$) is defined as an infinitesimal, arbitrary change in $u$ for a
fixed value of the variable $x$ (i.e., for $\delta x = 0$). Here $\delta$ is called the variational operator
(similar to the differential operator $d$). The operation of variation is commutative with
both integration and differentiation, that is,

$$\delta\left(\int F\,dx\right) = \int (\delta F)\,dx \tag{12.3}$$

$$\delta\left(\frac{du}{dx}\right) = \frac{d}{dx}(\delta u) \tag{12.4}$$
Figure 12.1 Tentative and exact solutions.


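Also, we define the variation of a function of several variables or a functional in a
manner similar to the calculus definition of a total differential:

$$\delta F = \frac{\partial F}{\partial u}\,\delta u + \frac{\partial F}{\partial u'}\,\delta u' + \frac{\partial F}{\partial u''}\,\delta u'' + \underbrace{\frac{\partial F}{\partial x}\,\delta x}_{0} \tag{12.5}$$

(since we are finding variation of $F$ for a fixed value of $x$, i.e., $\delta x = 0$).

Now, let us consider the variation in $A$ ($\delta A$) corresponding to variations in the
solution ($\delta u$). If we want the condition for the stationariness of $A$, we take the
necessary condition as the vanishing of the first derivative of $A$ (similar to maximization or
minimization of simple functions in ordinary calculus):

$$\delta A = \int_{x_1}^{x_2}\left(\frac{\partial F}{\partial u}\,\delta u + \frac{\partial F}{\partial u'}\,\delta u' + \frac{\partial F}{\partial u''}\,\delta u''\right)dx = \int_{x_1}^{x_2} \delta F\,dx = 0 \tag{12.6}$$

Integrate the second and third terms by parts to obtain

$$\int_{x_1}^{x_2} \frac{\partial F}{\partial u'}\,\delta u'\,dx = \left.\frac{\partial F}{\partial u'}\,\delta u\right|_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial F}{\partial u'}\right)\delta u\,dx \tag{12.7}$$

$$\int_{x_1}^{x_2} \frac{\partial F}{\partial u''}\,\delta u''\,dx = \left.\frac{\partial F}{\partial u''}\,\delta u'\right|_{x_1}^{x_2} - \left.\frac{d}{dx}\left(\frac{\partial F}{\partial u''}\right)\delta u\right|_{x_1}^{x_2} + \int_{x_1}^{x_2} \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial u''}\right)\delta u\,dx \tag{12.8}$$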
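Thus

$$\delta A = \int_{x_1}^{x_2}\left[\frac{\partial F}{\partial u} - \frac{d}{dx}\left(\frac{\partial F}{\partial u'}\right) + \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial u''}\right)\right]\delta u\,dx + \left.\left[\frac{\partial F}{\partial u'} - \frac{d}{dx}\left(\frac{\partial F}{\partial u''}\right)\right]\delta u\right|_{x_1}^{x_2} + \left.\frac{\partial F}{\partial u''}\,\delta u'\right|_{x_1}^{x_2} = 0 \tag{12.9}$$

Since $\delta u$ is arbitrary, each term must vanish individually:

$$\frac{\partial F}{\partial u} - \frac{d}{dx}\left(\frac{\partial F}{\partial u'}\right) + \frac{d^2}{dx^2}\left(\frac{\partial F}{\partial u''}\right) = 0 \tag{12.10}$$

$$\left.\left[\frac{\partial F}{\partial u'} - \frac{d}{dx}\left(\frac{\partial F}{\partial u''}\right)\right]\delta u\right|_{x_1}^{x_2} = 0 \tag{12.11}$$

$$\left.\frac{\partial F}{\partial u''}\,\delta u'\right|_{x_1}^{x_2} = 0 \tag{12.12}$$

Equation (12.10) will be the governing differential equation for the given problem and
is called the Euler equation or Euler-Lagrange equation. Equations (12.11) and (12.12)
give the boundary conditions.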


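The conditions

$$\left.\left[\frac{\partial F}{\partial u'} - \frac{d}{dx}\left(\frac{\partial F}{\partial u''}\right)\right]\right|_{x_1}^{x_2} = 0 \tag{12.13}$$

$$\left.\frac{\partial F}{\partial u''}\right|_{x_1}^{x_2} = 0 \tag{12.14}$$

are called natural boundary conditions (if they are satisfied, they are called free
boundary conditions). If the natural boundary conditions are not satisfied, we should have

$$\delta u(x_1) = 0, \qquad \delta u(x_2) = 0 \tag{12.15}$$

$$\delta u'(x_1) = 0, \qquad \delta u'(x_2) = 0 \tag{12.16}$$

in order to satisfy Eqs. (12.11) and (12.12). These are called geometric or forced
boundary conditions.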
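
As an illustration of how the Euler-Lagrange equation and the two kinds of boundary conditions arise together, consider an integrand of the type met in beam bending. This worked example is a sketch under stated assumptions and is not taken from the text: $EI$ is an assumed constant flexural rigidity, $q(x)$ an assumed distributed load, and $u(x)$ the transverse deflection.

$$F(x, u, u', u'') = \tfrac{1}{2}\,EI\,(u'')^2 - q\,u$$

$$\frac{\partial F}{\partial u} = -q, \qquad \frac{\partial F}{\partial u'} = 0, \qquad \frac{\partial F}{\partial u''} = EI\,u''$$

Substituting in Eq. (12.10) gives the governing equation, and Eqs. (12.11) and (12.12) give the boundary terms:

$$-q + \frac{d^2}{dx^2}\left(EI\,u''\right) = 0 \quad\Longrightarrow\quad EI\,u'''' = q$$

$$\left.\left(-EI\,u'''\right)\delta u\right|_{x_1}^{x_2} = 0, \qquad \left.\left(EI\,u''\right)\delta u'\right|_{x_1}^{x_2} = 0$$

At each end, either the natural condition holds ($EI\,u''' = 0$ or $EI\,u'' = 0$, i.e., zero shear or zero moment) or the corresponding geometric condition holds ($\delta u = 0$ or $\delta u' = 0$, i.e., prescribed deflection or prescribed slope).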


Example 12.1 Brachistochrone Problem In June 1696, Johann Bernoulli set the
following problem before the scholars of his time. "Given two points A and B in a
vertical plane, find the path from A to B along which a particle of mass m will slide
under the force of gravity, without friction, in the shortest time" (Fig. 12.2). The term
brachistochrone derives from the Greek brachistos (shortest) and chronos (time).

If $s$ is the distance along the path and $v$ the velocity, we have

$$v = \frac{ds}{dt} = \frac{(dx^2 + dy^2)^{1/2}}{dt} = \frac{[1 + (y')^2]^{1/2}\,dx}{dt}$$

$$dt = \frac{1}{v}\,[1 + (y')^2]^{1/2}\,dx$$

Figure 12.2 Curve of minimum time of descent.

Since potential energy is converted to kinetic energy as the particle moves down the
path, we can write

$$\tfrac{1}{2}mv^2 = mgx$$

Hence

$$dt = \left[\frac{1 + (y')^2}{2gx}\right]^{1/2} dx \tag{E$_1$}$$

and the integral to be stationary is

$$t = \int_0^{x_B} \left[\frac{1 + (y')^2}{2gx}\right]^{1/2} dx \tag{E$_2$}$$

The integrand is a function of $x$ and $y'$ and so is a special case of Eq. (12.1). Using
the Euler-Lagrange equation,

$$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0 \qquad \text{with} \qquad F = \left[\frac{1 + (y')^2}{2gx}\right]^{1/2}$$

we obtain

$$\frac{d}{dx}\left(\frac{y'}{\{x[1 + (y')^2]\}^{1/2}}\right) = 0$$

Integrating yields

$$y' = \frac{dy}{dx} = \left(\frac{C_1 x}{1 - C_1 x}\right)^{1/2} \tag{E$_3$}$$

where $C_1$ is a constant of integration. The ordinary differential equation (E$_3$) yields on
integration the solution to the problem as

$$y(x) = C_1 \sin^{-1}(x/C_1) - (2C_1 x - x^2)^{1/2} + C_2 \tag{E$_4$}$$
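
The first integral behind Eq. (E3) can be checked numerically. The sketch below evaluates the quantity $y'/\{x[1 + (y')^2]\}^{1/2}$, whose derivative the Euler-Lagrange equation sets to zero, at several depths using the slope of Eq. (E3). The value $C_1 = 0.5$ and the sample points are hypothetical choices made only for illustration; any values with $0 < C_1 x < 1$ would serve.

```python
import math


def slope(x, c1):
    """Slope y'(x) from Eq. (E3): y' = sqrt(C1 x / (1 - C1 x))."""
    return math.sqrt(c1 * x / (1.0 - c1 * x))


def first_integral(x, c1):
    """Quantity whose x-derivative the Euler-Lagrange equation sets to zero."""
    yp = slope(x, c1)
    return yp / math.sqrt(x * (1.0 + yp * yp))


C1 = 0.5  # hypothetical constant; requires 0 < C1 * x < 1 on the sampled range
samples = [first_integral(x, C1) for x in (0.1, 0.5, 1.0, 1.5)]
print(samples)
```

Every sample equals $\sqrt{C_1}$, confirming that the slope of Eq. (E3) keeps the differentiated quantity constant along the path.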


Example 12.2 Design of a Solid Body of Revolution for Minimum Drag Next we
consider the problem of determining the shape of a solid body of revolution for
minimum drag. In the general case, the forces exerted on a solid body translating in a fluid
depend on the shape of the body and the relative velocity in a very complex manner.
However, if the density of the fluid is sufficiently small, the normal pressure ($p$) acting
on the solid body can be approximately taken as [12.3]

$$p = 2\rho v^2 \sin^2\theta \tag{E$_1$}$$

where $\rho$ is the density of the fluid, $v$ the velocity of the fluid relative to the solid body,
and $\theta$ the angle between the direction of the velocity of the fluid and the tangent to the
surface as shown in Fig. 12.3.

Since the pressure ($p$) acts normal to the surface, the $x$-component of the force
acting on the surface of a slice of length $dx$ and radius $y(x)$ shown in Fig. 12.4 can
be written as

$$dP = (\text{normal pressure})(\text{surface area})\sin\theta = (2\rho v^2 \sin^2\theta)\left(2\pi y\sqrt{1 + (y')^2}\,dx\right)\sin\theta \tag{E$_2$}$$

where $y' = dy/dx$. The total drag force, $P$, is given by the integral of Eq. (E$_2$) as

$$P = \int_0^L 4\pi\rho v^2\,y \sin^3\theta\,\sqrt{1 + (y')^2}\,dx \tag{E$_3$}$$

where $L$ is the length of the body. To simplify the calculations, we assume that $y' \ll 1$
so that

$$\sin\theta = \frac{y'}{\sqrt{1 + (y')^2}} \simeq y' \tag{E$_4$}$$

Thus Eq. (E$_3$) can be approximated as

$$P = 4\pi\rho v^2 \int_0^L (y')^3\,y\,dx \tag{E$_5$}$$

Figure 12.3 Solid body of revolution translating in a fluid medium.

Figure 12.4 Element of surface area acted on by the pressure p.

Now the minimum drag problem can be stated as follows.


Find $y(x)$ which minimizes the drag $P$ given by Eq. (E$_5$) subject to the condition
that $y(x)$ satisfies the end conditions

$$y(x = 0) = 0 \qquad \text{and} \qquad y(x = L) = R \tag{E$_6$}$$

By comparing the functional $P$ of Eq. (E$_5$) with $A$ of Eq. (12.1), we find that

$$F(x, y, y', y'') = 4\pi\rho v^2 (y')^3 y \tag{E$_7$}$$

The Euler-Lagrange equation, Eq. (12.10), corresponding to this functional can be
obtained as

$$(y')^3 - 3\frac{d}{dx}\left[y(y')^2\right] = 0 \tag{E$_8$}$$

The boundary conditions, Eqs. (12.11) and (12.12), reduce to

$$\left[3y(y')^2\right]\delta y\,\Big|_{x_1 = 0}^{x_2 = L} = 0 \tag{E$_9$}$$

Equation (E$_8$) can be written as

$$(y')^3 - 3\left[y'(y')^2 + y(2)y'y''\right] = 0$$

or

$$(y')^3 + 3yy'y'' = 0 \tag{E$_{10}$}$$

This equation, when integrated once, gives

$$y(y')^3 = k_1^3 \tag{E$_{11}$}$$

where $k_1$ is a constant of integration. Integrating Eq. (E$_{11}$), we obtain

$$y(x) = (k_1 x + k_2)^{3/4} \tag{E$_{12}$}$$

The application of the boundary conditions, Eqs. (E$_6$), gives the values of the
constants as

$$k_1 = \frac{R^{4/3}}{L} \qquad \text{and} \qquad k_2 = 0$$

Hence the shape of the solid body having minimum drag is given by the equation

$$y(x) = R\left(\frac{x}{L}\right)^{3/4}$$

12.2.3 Lagrange Multipliers and Constraints

If the variable $x$ is not completely independent but has to satisfy some condition(s) of
constraint, the problem can be stated as follows:

Find the function $y(x)$ such that the integral

$$A = \int_{x_1}^{x_2} F\left(x, y, \frac{dy}{dx}\right)dx \to \text{minimum}$$

subject to the constraint

$$g\left(x, y, \frac{dy}{dx}\right) = 0 \tag{12.17}$$

where $g$ may be an integral function. The stationary value of a constrained calculus of
variations problem can be found by the use of Lagrange multipliers. To illustrate the
method, let us consider a problem known as the isoperimetric problem, given below.


Example 12.3 Optimum Design of a Cooling Fin Cooling fins are used on radiators
to increase the rate of heat transfer from a hot surface (wall) to the surrounding
fluid. Often, we will be interested in finding the optimum tapering of a fin (of
rectangular cross section) of specified total mass which transfers the maximum
heat energy.

The configuration of the fin is shown in Fig. 12.5. If $T_0$ and $T_\infty$ denote the wall
and the ambient temperatures, respectively, the temperature of the fin at any point,
$T(x)$, can be nondimensionalized as

$$t(x) = \frac{T(x) - T_\infty}{T_0 - T_\infty} \tag{E$_1$}$$

so that $t(0) = 1$ and $t(\infty) = 0$.

Figure 12.5 Geometry of a cooling fin.
To formulate the problem, we first write the heat balance equation for an elemental
length, $dx$, of the fin:

heat inflow by conduction = heat outflow by conduction and convection

that is,

$$\left(-kA\frac{dt}{dx}\right)_x = \left(-kA\frac{dt}{dx}\right)_{x+dx} + hS(t - t_\infty) \tag{E$_2$}$$

where $k$ is the thermal conductivity, $A$ the cross-sectional area of the fin $= 2y(x)$
per unit width of the fin, $h$ the heat transfer coefficient, $S$ the surface area of the fin
element $= 2\sqrt{1 + (y')^2}\,dx$ per unit width, and $2y(x)$ the depth of the fin at any section
$x$. By writing

$$\left(-kA\frac{dt}{dx}\right)_{x+dx} = \left(-kA\frac{dt}{dx}\right)_x + \frac{d}{dx}\left(-kA\frac{dt}{dx}\right)dx \tag{E$_3$}$$

and noting that $t_\infty = 0$, we can simplify Eq. (E$_2$) as

$$\frac{d}{dx}\left(ky\frac{dt}{dx}\right) = ht\sqrt{1 + (y')^2} \tag{E$_4$}$$

Assuming that $y' \ll 1$ for simplicity, this equation can be written as

$$k\frac{d}{dx}\left(y\frac{dt}{dx}\right) = ht \tag{E$_5$}$$

The amount of heat dissipated from the fin to the surroundings per unit time is
given by

$$H = 2\int_0^L ht\,dx \tag{E$_6$}$$

by assuming that the heat flow from the free end of the fin is zero. Since the mass of
the fin is specified as $m$, we have

$$2\int_0^L \rho y\,dx - m = 0 \tag{E$_7$}$$

where $\rho$ is the density of the fin.

Now the problem can be stated as follows: Find $t(x)$ that maximizes the integral
in Eq. (E$_6$) subject to the constraint equation (E$_7$). Since $y(x)$ in Eq. (E$_7$) is also not
known, it can be expressed in terms of $t(x)$ using the heat balance equation (E$_5$). By
integrating Eq. (E$_5$) between the limits $x$ and $L$, we obtain

$$-ky(x)\frac{dt}{dx}(x) = h\int_x^L t(x)\,dx \tag{E$_8$}$$

by assuming the heat flow from the free end to be zero. Equation (E$_8$) gives

$$y(x) = -\frac{h}{k}\,\frac{1}{dt/dx}\int_x^L t(x)\,dx \tag{E$_9$}$$
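By substituting Eq. (E$_9$) in (E$_7$), the variational problem can be restated as follows:
Find $t(x)$ which maximizes

$$H = 2h\int_0^L t(x)\,dx \tag{E$_{10}$}$$

subject to the constraint

$$g(x, t, t') = 2\rho\,\frac{h}{k}\int_0^L \frac{1}{dt/dx}\int_x^L t(x)\,dx\,dx + m = 0 \tag{E$_{11}$}$$

This problem can be solved by using the Lagrange multiplier method. The functional
$I$ to be extremized is given by

$$I = \int_0^L (H + \lambda g)\,dx = 2h\int_0^L\left[t(x) + \frac{\lambda\rho}{k}\,\frac{1}{dt/dx}\int_x^L t(x)\,dx\right]dx \tag{E$_{12}$}$$

where $\lambda$ is the Lagrange multiplier.

By comparing Eq. (E$_{12}$) with Eq. (12.1) we find that

$$F(x, t, t') = 2ht + \frac{2h\lambda\rho}{k}\,\frac{1}{t'}\int_x^L t(x)\,dx \tag{E$_{13}$}$$

The Euler-Lagrange equation, Eq. (12.10), gives

$$h - \frac{\lambda h\rho}{k}\left[\frac{2t''}{(t')^3}\int_x^L t(x)\,dx + \frac{t(x)}{(t')^2} - \int_0^x \frac{dx}{t'}\right] = 0 \tag{E$_{14}$}$$

This integrodifferential equation has to be solved to find the solution $t(x)$. In this case
we can verify that

$$t(x) = 1 - x\left(\frac{\lambda\rho}{k}\right)^{1/2} \tag{E$_{15}$}$$

satisfies Eq. (E$_{14}$). The thickness profile of the fin can be obtained from Eq. (E$_9$) as

$$y(x) = -\frac{h}{k}\,\frac{1}{t'}\int_x^L t(x)\,dx = \frac{h}{k}\left(\frac{k}{\lambda\rho}\right)^{1/2}\int_x^L\left[1 - x\left(\frac{\lambda\rho}{k}\right)^{1/2}\right]dx = \frac{h}{(k\lambda\rho)^{1/2}}\left[L - x - \left(\frac{\lambda\rho}{k}\right)^{1/2}\frac{L^2 - x^2}{2}\right] = c_1 + c_2 x + c_3 x^2 \tag{E$_{16}$}$$

where

$$c_1 = \frac{h}{(k\lambda\rho)^{1/2}}\left[L - \left(\frac{\lambda\rho}{k}\right)^{1/2}\frac{L^2}{2}\right] = \frac{hL}{(k\lambda\rho)^{1/2}} - \frac{hL^2}{2k} \tag{E$_{17}$}$$

$$c_2 = -\frac{h}{(k\lambda\rho)^{1/2}} \tag{E$_{18}$}$$

$$c_3 = \frac{h}{2(k\lambda\rho)^{1/2}}\left(\frac{\lambda\rho}{k}\right)^{1/2} = \frac{h}{2k} \tag{E$_{19}$}$$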



The value of the unknown constant $\lambda$ can be found by using Eq. (E7) as

$$m = 2\rho\int_0^L y(x)\,dx = 2\rho\left(c_1 L + c_2\frac{L^2}{2} + c_3\frac{L^3}{3}\right)$$

that is,

$$\frac{m}{2\rho L} = c_1 + c_2\frac{L}{2} + c_3\frac{L^2}{3} = \frac{hL}{2(k\rho\lambda)^{1/2}} - \frac{hL^2}{3k} \tag{E20}$$

Equation (E20) gives

$$\lambda^{1/2} = \frac{hL}{(k\rho)^{1/2}\left[\dfrac{m}{\rho L} + \dfrac{2}{3}\,\dfrac{hL^2}{k}\right]} \tag{E21}$$

Hence the desired solution can be obtained by substituting Eq. (E21) in Eq. (E16).
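The closed-form result can be checked numerically. The following minimal sketch evaluates the multiplier from Eq. (E21) and the thickness coefficients $c_1$, $c_2$, $c_3$, then confirms that the resulting profile reproduces the specified mass and tapers to zero thickness at the free end. The function names and the numerical data are illustrative assumptions, not values from the text.

```python
import math

def optimal_fin(h, k, rho, m, L):
    """Return (lambda, (c1, c2, c3)) for the fin of mass m and length L."""
    # Lagrange multiplier from Eq. (E21)
    sqrt_lam = h * L / (math.sqrt(k * rho) * (m / (rho * L) + (2.0 / 3.0) * h * L**2 / k))
    lam = sqrt_lam**2
    s = math.sqrt(k * lam * rho)
    # Coefficients of y(x) = c1 + c2*x + c3*x**2
    c1 = h / s * (L - 0.5 * L**2 * math.sqrt(lam * rho / k))
    c2 = -h / s
    c3 = h / (2.0 * k)
    return lam, (c1, c2, c3)

def fin_mass(rho, c, L):
    """Mass 2*rho*integral of y(x) over [0, L], as in Eq. (E7)."""
    c1, c2, c3 = c
    return 2.0 * rho * (c1 * L + c2 * L**2 / 2.0 + c3 * L**3 / 3.0)

if __name__ == "__main__":
    # Hypothetical data chosen only to exercise the formulas
    lam, c = optimal_fin(h=10.0, k=200.0, rho=2700.0, m=1.0, L=0.1)
    print("lambda =", lam)
    print("mass   =", fin_mass(2700.0, c, 0.1))
```

The mass returned by `fin_mass` should agree with the specified `m` to rounding error, because Eq. (E21) was obtained from exactly that constraint.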


12.2.4 Generalization 


The concept of including constraints can be generalized as follows. Let the problem be
to find the functions $u_1(x, y, z), u_2(x, y, z), \ldots, u_n(x, y, z)$ that make the functional

$$L = \int_V f\left(x, y, z, u_1, u_2, \ldots, u_n, \frac{\partial u_1}{\partial x}, \ldots\right)dV \tag{12.18}$$

stationary subject to the $m$ constraints

$$g_1\left(x, y, z, u_1, u_2, \ldots, u_n, \frac{\partial u_1}{\partial x}, \ldots\right) = 0$$

$$\vdots$$

$$g_m\left(x, y, z, u_1, u_2, \ldots, u_n, \frac{\partial u_1}{\partial x}, \ldots\right) = 0 \tag{12.19}$$

The Lagrange multiplier method consists in taking variations in the functional

$$A = \int_V \left(f + \lambda_1 g_1 + \lambda_2 g_2 + \cdots + \lambda_m g_m\right)dV \tag{12.20}$$

where $\lambda_i$ are now functions of position. In the special case where one or more of the
$g_i$ are integral conditions, the associated $\lambda_i$ are constants.
The basic optimal control problem can be stated as follows:

Find the control vector $\mathbf{u} = \{u_1\ u_2\ \cdots\ u_m\}^T$
which minimizes the functional, called the performance index,

$$J = \int_0^T f_0(\mathbf{x}, \mathbf{u}, t)\,dt \tag{12.21}$$

where $\mathbf{x} = \{x_1\ x_2\ \cdots\ x_n\}^T$
is called the state vector, $t$ the time parameter, $T$ the terminal time, and $f_0$ is a function
of $\mathbf{x}$, $\mathbf{u}$, and $t$. The state variables $x_i$ and the control variables $u_i$ are related as

$$\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n; u_1, u_2, \ldots, u_m; t), \quad i = 1, 2, \ldots, n$$

or

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u}, t) \tag{12.22}$$

In many problems, the system is linear and Eq. (12.22) can be stated as

$$\dot{\mathbf{x}} = [A]\mathbf{x} + [B]\mathbf{u} \tag{12.23}$$

where $[A]$ is an $n \times n$ matrix and $[B]$ is an $n \times m$ matrix. Further, while finding the
control vector $\mathbf{u}$, the state vector $\mathbf{x}$ is to be transferred from a known initial vector $\mathbf{x}_0$
at $t = 0$ to a terminal vector $\mathbf{x}_T$ at $t = T$, where some (or all or none) of the state
variables are specified.


12.3.1 Necessary Conditions for Optimal Control 

To derive the necessary conditions for the optimal control, we consider the following
simple problem:

$$\text{Find } u \text{ which minimizes } J = \int_0^T f_0(x, u, t)\,dt \tag{12.24}$$

subject to

$$\dot{x} = f(x, u, t) \tag{12.25}$$

with the boundary condition $x(0) = k_1$. To solve this optimal control problem, we
introduce a Lagrange multiplier $\lambda$ and define an augmented functional $J^*$ as

$$J^* = \int_0^T \left\{f_0(x, u, t) + \lambda\left[f(x, u, t) - \dot{x}\right]\right\}dt \tag{12.26}$$

Since the integrand

$$F = f_0 + \lambda(f - \dot{x}) \tag{12.27}$$

is a function of the two variables $x$ and $u$, we can write the Euler-Lagrange equations
[with $u_1 = x$, $u_1' = dx/dt = \dot{x}$, $u_2 = u$, and $u_2' = du/dt = \dot{u}$ in Eq. (12.10)] as

$$\frac{\partial F}{\partial x} - \frac{d}{dt}\left(\frac{\partial F}{\partial \dot{x}}\right) = 0 \tag{12.28}$$

$$\frac{\partial F}{\partial u} - \frac{d}{dt}\left(\frac{\partial F}{\partial \dot{u}}\right) = 0 \tag{12.29}$$

In view of relation (12.27), Eqs. (12.28) and (12.29) can be expressed as

$$\frac{\partial f_0}{\partial x} + \lambda\frac{\partial f}{\partial x} + \dot{\lambda} = 0 \tag{12.30}$$

$$\frac{\partial f_0}{\partial u} + \lambda\frac{\partial f}{\partial u} = 0 \tag{12.31}$$

A new functional $H$, called the Hamiltonian, is defined as

$$H = f_0 + \lambda f \tag{12.32}$$

and Eqs. (12.30) and (12.31) can be rewritten as

$$-\frac{\partial H}{\partial x} = \dot{\lambda} \tag{12.33}$$

$$\frac{\partial H}{\partial u} = 0 \tag{12.34}$$

Equations (12.25) and (12.33), with $u$ eliminated through the algebraic condition (12.34),
represent two first-order differential equations. The integration of these equations leads
to two constants whose values can be found from the known boundary conditions of the
problem. If two boundary conditions are specified as $x(0) = k_1$ and $x(T) = k_2$, the two
integration constants can be evaluated without any difficulty. On the other hand, if only
one boundary condition is specified as, say, $x(0) = k_1$, the free-end condition is used as
$\partial F/\partial \dot{x} = 0$ or $\lambda = 0$ at $t = T$.


Example 12.4 Find the optimal control $u$ that makes the functional

$$J = \int_0^1 (x^2 + u^2)\,dt \tag{E1}$$

stationary with

$$\dot{x} = u \tag{E2}$$

and $x(0) = 1$. Note that the value of $x$ is not specified at $t = 1$.

SOLUTION The Hamiltonian can be expressed as

$$H = f_0 + \lambda u = x^2 + u^2 + \lambda u \tag{E3}$$

and Eqs. (12.33) and (12.34) give

$$-2x = \dot{\lambda} \tag{E4}$$

$$2u + \lambda = 0 \tag{E5}$$

Differentiation of Eq. (E5) leads to

$$2\dot{u} + \dot{\lambda} = 0 \tag{E6}$$

Equations (E4) and (E6) yield

$$\dot{u} = x \tag{E7}$$

Since $\dot{x} = u$ [Eq. (E2)], we obtain

$$\ddot{x} = \dot{u} = x$$

that is,

$$\ddot{x} - x = 0 \tag{E8}$$

The solution of Eq. (E8) is given by

$$x(t) = c_1 \sinh t + c_2 \cosh t \tag{E9}$$

where $c_1$ and $c_2$ are constants. By using the initial condition $x(0) = 1$, we obtain $c_2 = 1$.
Since $x$ is not fixed at the terminal point $t = T = 1$, we use the condition $\lambda = 0$ at
$t = 1$ in Eq. (E5) and obtain $u(t = 1) = 0$. But

$$u = \dot{x} = c_1 \cosh t + \sinh t \tag{E10}$$

Thus

$$u(1) = 0 = c_1 \cosh 1 + \sinh 1$$

or

$$c_1 = \frac{-\sinh 1}{\cosh 1} \tag{E11}$$

and hence the optimal control is

$$u(t) = \frac{-\sinh 1}{\cosh 1}\cosh t + \sinh t = \frac{-\sinh 1 \cdot \cosh t + \cosh 1 \cdot \sinh t}{\cosh 1} = \frac{-\sinh(1 - t)}{\cosh 1} \tag{E12}$$

The corresponding state trajectory is given by

$$x(t) = \dot{u} = \frac{\cosh(1 - t)}{\cosh 1} \tag{E13}$$
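As a numerical cross-check of this example, the sketch below integrates $\dot{x} = u$ from $x(0) = 1$ with the trapezoidal rule and accumulates $J = \int_0^1 (x^2 + u^2)\,dt$ for a given control. For the control of Eq. (E12) the integral evaluates analytically to $\tanh 1$, and any other admissible control should give a larger value. The function names and the comparison controls are illustrative assumptions.

```python
import math

def u_opt(t):
    """Optimal control of Eq. (E12)."""
    return -math.sinh(1.0 - t) / math.cosh(1.0)

def cost(u, n=2000):
    """Trapezoidal estimate of J for x' = u(t), x(0) = 1, on [0, 1]."""
    dt = 1.0 / n
    x = 1.0
    total = 0.0
    for i in range(n):
        t0, t1 = i * dt, (i + 1) * dt
        x_next = x + 0.5 * dt * (u(t0) + u(t1))      # advance the state
        f_left = x * x + u(t0) ** 2
        f_right = x_next * x_next + u(t1) ** 2
        total += 0.5 * dt * (f_left + f_right)        # accumulate the cost
        x = x_next
    return total

if __name__ == "__main__":
    print("J(optimal)   =", cost(u_opt))
    print("tanh(1)      =", math.tanh(1.0))
    print("J(u = 0)     =", cost(lambda t: 0.0))
    print("J(perturbed) =", cost(lambda t: u_opt(t) + 0.2 * math.sin(math.pi * t)))
```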


12.3.2 Necessary Conditions for a General Problem 

We shall now consider the basic optimal control problem stated earlier:
Find the optimal control vector $\mathbf{u}$ that minimizes

$$J = \int_0^T f_0(\mathbf{x}, \mathbf{u}, t)\,dt \tag{12.35}$$

subject to

$$\dot{x}_i = f_i(\mathbf{x}, \mathbf{u}, t), \quad i = 1, 2, \ldots, n \tag{12.36}$$

Now we introduce a Lagrange multiplier $p_i$, also known as the adjoint variable, for
the $i$th constraint equation in (12.36) and form an augmented functional $J^*$ as

$$J^* = \int_0^T \left[f_0 + \sum_{i=1}^n p_i\,(f_i - \dot{x}_i)\right]dt \tag{12.37}$$

The Hamiltonian functional, $H$, is defined as

$$H = f_0 + \sum_{i=1}^n p_i f_i \tag{12.38}$$

such that

$$J^* = \int_0^T \left(H - \sum_{i=1}^n p_i \dot{x}_i\right)dt \tag{12.39}$$

Since the integrand

$$F = H - \sum_{i=1}^n p_i \dot{x}_i \tag{12.40}$$

depends on $\mathbf{x}$, $\mathbf{u}$, and $t$, there are $n + m$ dependent variables ($\mathbf{x}$ and $\mathbf{u}$) and hence the
Euler-Lagrange equations become

$$\frac{\partial F}{\partial x_i} - \frac{d}{dt}\left(\frac{\partial F}{\partial \dot{x}_i}\right) = 0, \quad i = 1, 2, \ldots, n \tag{12.41}$$

$$\frac{\partial F}{\partial u_j} - \frac{d}{dt}\left(\frac{\partial F}{\partial \dot{u}_j}\right) = 0, \quad j = 1, 2, \ldots, m \tag{12.42}$$

In view of relation (12.40), Eqs. (12.41) and (12.42) can be rewritten as

$$-\frac{\partial H}{\partial x_i} = \dot{p}_i, \quad i = 1, 2, \ldots, n \tag{12.43}$$

$$\frac{\partial H}{\partial u_j} = 0, \quad j = 1, 2, \ldots, m \tag{12.44}$$

Equations (12.43) are known as adjoint equations.

The optimum solutions for $\mathbf{x}$, $\mathbf{u}$, and $\mathbf{p}$ can be obtained by solving Eqs. (12.36),
(12.43), and (12.44). There are a total of $2n + m$ equations with $n$ $x_i$'s, $n$ $p_i$'s, and $m$ $u_j$'s
as unknowns. If we know the initial conditions $x_i(0)$, $i = 1, 2, \ldots, n$, and the terminal
conditions $x_j(T)$, $j = 1, 2, \ldots, l$, with $l < n$, we will have the terminal values of the
remaining variables, namely $x_j(T)$, $j = l + 1, l + 2, \ldots, n$, free. Hence we will have
to use the free end conditions

$$p_j(T) = 0, \quad j = l + 1, l + 2, \ldots, n \tag{12.45}$$

Equations (12.45) are called the transversality conditions.
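The state equations, adjoint equations, and transversality conditions together form a two-point boundary value problem: the states are known at $t = 0$ while the free adjoint variables are known at $t = T$. One common way to handle this, sketched below as an illustration rather than as the method of the text, is shooting: guess the unknown initial adjoint value, integrate forward, and adjust the guess until the terminal condition is met. The sketch applies it to the scalar problem $J = \int_0^1 (x^2 + u^2)\,dt$, $\dot{x} = u$, $x(0) = 1$, for which $H = x^2 + u^2 + pu$; the function names, the RK4 integrator, and the secant step are assumptions of this sketch.

```python
import math

def rhs(t, y):
    """State and adjoint equations for H = x^2 + u^2 + p*u."""
    x, p = y
    u = -p / 2.0            # from dH/du = 2u + p = 0
    return (u, -2.0 * x)    # x' = u,  p' = -dH/dx = -2x

def integrate(p0, n=200):
    """Classical RK4 from t = 0 to t = 1 with x(0) = 1, p(0) = p0."""
    h = 1.0 / n
    t, y = 0.0, (1.0, p0)
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = rhs(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        t += h
    return y

def shoot():
    """Find p(0) such that the transversality condition p(1) = 0 holds."""
    a, b = 0.0, 1.0
    fa, fb = integrate(a)[1], integrate(b)[1]
    # The system is linear, so p(1) is affine in p(0) and one secant step suffices.
    return a - fa * (b - a) / (fb - fa)

if __name__ == "__main__":
    p0 = shoot()
    print("p(0) =", p0, " expected 2*tanh(1) =", 2.0 * math.tanh(1.0))
```

For nonlinear systems the single secant step would be replaced by an iterative root finder, and convergence would depend on the quality of the initial guess.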
12.4 OPTIMALITY CRITERIA METHODS

The optimality criteria methods are based on the derivation of appropriate criteria for
specialized design conditions and the development of an iterative procedure to find the optimum
design. The optimality criteria methods were originally developed by Prager and his 
associates for distributed (continuous) systems [12.6] and extended by Venkayya, Khot, 
and Berke for discrete systems [12.7-12.10]. The methods were first presented for 
linear elastic structures with stress and displacement constraints and later extended to 
problems with other types of constraints. We will present the basic approach using only 
displacement constraints. 


12.4.1 Optimality Criteria with a Single Displacement Constraint 

Let the optimization problem be stated as follows:

$$\text{Find } \mathbf{X} \text{ which minimizes } f(\mathbf{X}) = \sum_{i=1}^n c_i x_i \tag{12.46}$$

subject to

$$\sum_{i=1}^n \frac{a_i}{x_i} = y_{\max} \tag{12.47}$$

where $c_i$ are constants, $y_{\max}$ is the maximum permissible displacement, and $a_i$ depends
on the force induced in member $i$ due to the applied loads, length of member $i$, and
Young's modulus of member $i$. The Lagrangian function can be defined as

$$L(\mathbf{X}, \lambda) = \sum_{i=1}^n c_i x_i + \lambda\left(\sum_{i=1}^n \frac{a_i}{x_i} - y_{\max}\right) \tag{12.48}$$

At the optimum solution, we have

$$\frac{\partial L}{\partial x_k} = c_k - \lambda\frac{a_k}{x_k^2} + \lambda\sum_{i=1}^n \frac{1}{x_i}\frac{\partial a_i}{\partial x_k} = 0, \quad k = 1, 2, \ldots, n \tag{12.49}$$

It can be shown that the last term in Eq. (12.49) is zero for statically determinate as
well as indeterminate structures [12.8] so that Eq. (12.49) reduces to

$$c_k - \lambda\frac{a_k}{x_k^2} = 0, \quad k = 1, 2, \ldots, n \tag{12.50}$$

or

$$\lambda = \frac{c_k x_k^2}{a_k} \tag{12.51}$$

Equation (12.51) indicates that the quantity $c_k x_k^2/a_k$ is the same for all the design
variables. If all the design variables are to be changed, this relation can be used.
However, in practice, only a subset of design variables are involved in Eq. (12.49).
Thus it is convenient to divide the design variables into two sets: active variables [those
determined by the displacement constraint of Eq. (12.51)] and passive variables (those
determined by other considerations). Assuming that the first $\bar{n}$ variables denote the
active variables, we can rewrite Eqs. (12.46) and (12.47) as

$$f = \bar{f} + \sum_{i=1}^{\bar{n}} c_i x_i \tag{12.52}$$

$$\sum_{i=1}^{\bar{n}} \frac{a_i}{x_i} = y_{\max} - \bar{y} = y^* \tag{12.53}$$

where $\bar{f}$ and $\bar{y}$ denote the contribution of the passive variables to $f$ and $y$, respectively.
Equation (12.51) now gives

$$x_k = \sqrt{\lambda}\,\sqrt{\frac{a_k}{c_k}}, \quad k = 1, 2, \ldots, \bar{n} \tag{12.54}$$

Substituting Eq. (12.54) into Eq. (12.53), and solving for $\lambda$, we obtain

$$\sqrt{\lambda} = \frac{1}{y^*}\sum_{k=1}^{\bar{n}} \sqrt{a_k c_k} \tag{12.55}$$

Using Eq. (12.55) in Eq. (12.54) results in

$$x_k = \frac{1}{y^*}\sqrt{\frac{a_k}{c_k}}\,\sum_{i=1}^{\bar{n}} \sqrt{a_i c_i}, \quad k = 1, 2, \ldots, \bar{n} \tag{12.56}$$

Equation (12.56) is the optimality criterion that must be satisfied at the optimum solution
of the problem stated by Eqs. (12.46) and (12.47). This equation can be used to
iteratively update the design variables $x_k$ as

$$x_k^{(j+1)} = \left\{\frac{1}{y^*}\sqrt{\frac{a_k}{c_k}}\,\sum_{i=1}^{\bar{n}} \sqrt{a_i c_i}\right\}^{(j)}, \quad k = 1, 2, \ldots, \bar{n} \tag{12.57}$$

where the superscript $j$ denotes the iteration cycle. In each iteration, the components
$a_k$ and $c_k$ are assumed to be constants (in general, they depend on the design vector).
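For constant $a_k$ and $c_k$, the update of Eq. (12.57) is a single closed-form evaluation. The sketch below implements it for the active variables only; the function name and the sample data are hypothetical. In an actual structural problem the $a_k$ would be recomputed from a new analysis after every update.

```python
import math

def oc_update(a, c, y_star):
    """Active design variables from Eq. (12.56) for given a_k, c_k, and y*."""
    s = sum(math.sqrt(ai * ci) for ai, ci in zip(a, c))
    return [math.sqrt(ak / ck) * s / y_star for ak, ck in zip(a, c)]

if __name__ == "__main__":
    a = [1.0, 4.0]      # hypothetical displacement coefficients
    c = [1.0, 1.0]      # hypothetical cost coefficients
    x = oc_update(a, c, y_star=3.0)
    print("x =", x)
    print("displacement =", sum(ai / xi for ai, xi in zip(a, x)))
    print("c_k x_k^2 / a_k =", [ck * xk**2 / ak for ak, ck, xk in zip(a, c, x)])
```

The last line prints the quantity $c_k x_k^2/a_k$ for each variable, which Eq. (12.51) requires to be equal across all active variables.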


12.4.2 Optimality Criteria with Multiple Displacement Constraints 

When multiple displacement constraints are included, as in the case of a structure subjected
to multiple-load conditions, the optimization problem can be stated as follows:

Find a set of active variables $\mathbf{X} = \{x_1\ x_2\ \cdots\ x_n\}^T$ which minimizes

$$f(\mathbf{X}) = f_0 + \sum_{i=1}^n c_i x_i \tag{12.58}$$

subject to

$$y_j = \sum_{i=1}^n \frac{a_{ji}}{x_i} = y_j^*, \quad j = 1, 2, \ldots, J \tag{12.59}$$

where $J$ denotes the number of displacement (equality) constraints, $y_j^*$ the maximum
permissible value of the displacement $y_j$, and $a_{ji}$ is a parameter that depends on the
force induced in member $i$ due to the applied loads, length of member $i$, and Young's
modulus of member $i$. The Lagrangian function corresponding to Eqs. (12.58) and
(12.59) can be expressed as

$$L(\mathbf{X}, \lambda_1, \ldots, \lambda_J) = f_0 + \sum_{i=1}^n c_i x_i + \sum_{j=1}^J \lambda_j\left(\sum_{i=1}^n \frac{a_{ji}}{x_i} - y_j^*\right) \tag{12.60}$$

and the necessary conditions of optimality are given by

$$\frac{\partial L}{\partial x_k} = c_k - \sum_{j=1}^J \lambda_j\frac{a_{jk}}{x_k^2} = 0, \quad k = 1, 2, \ldots, n \tag{12.61}$$

Equations (12.61) can be rewritten as

$$x_k = \left[\sum_{j=1}^J \lambda_j\frac{a_{jk}}{c_k}\right]^{1/2}, \quad k = 1, 2, \ldots, n \tag{12.62}$$

Note that Eq. (12.62) can be used to iteratively update the variable $x_k$ as

$$x_k^{(j+1)} = \left\{\left[\sum_{j=1}^J \lambda_j\frac{a_{jk}}{c_k}\right]^{1/2}\right\}^{(j)}, \quad k = 1, 2, \ldots, n \tag{12.63}$$

where the superscript denotes the iteration cycle (not the constraint index) and the values
of the Lagrange multipliers $\lambda_j$ are also not known at the beginning.
Several computational methods can be used to solve Eqs. (12.63) [12.7, 12.8].


12.4.3 Reciprocal Approximations 

In some structural optimization problems, it is convenient and useful to consider the
reciprocals of member cross-sectional areas ($1/A_i$) as the new design variables ($z_i$). If
the problem deals with the minimization of weight of a statically determinate structure
subject to displacement or stress constraints, the objective function and its gradient
can be expressed as explicit functions of the variables $z_i$ and the constraints can be
expressed as linear functions of the variables $z_i$. If the structure is statically indeterminate,
the objective function remains a simple function of $z_i$ but the constraints may
not be linear in terms of $z_i$; however, a first-order Taylor series (linear) approximation
of the constraints gives a very high-quality approximation of these constraints. With
reciprocal variables, the optimization problem with a single displacement constraint
can be stated as follows:

$$\text{Find } \mathbf{Z} = \{z_1\ z_2\ \cdots\ z_n\}^T \text{ which minimizes } f(\mathbf{Z}) \tag{12.64}$$

subject to

$$g(\mathbf{Z}) = 0 \tag{12.65}$$

The necessary condition of optimality can be expressed as

$$\frac{\partial f}{\partial z_i} + \lambda\frac{\partial g}{\partial z_i} = 0, \quad i = 1, 2, \ldots, n \tag{12.66}$$

Assuming $f$ to be linear in terms of the areas of cross section (original variables,
$x_i = A_i$) and $g$ to be linear in terms of $z_i$, we have

$$\frac{\partial f}{\partial z_i} = \frac{\partial f}{\partial x_i}\frac{\partial x_i}{\partial z_i} = -\frac{1}{z_i^2}\frac{\partial f}{\partial x_i} \tag{12.67}$$

and Eqs. (12.66) and (12.67) yield

$$x_i = \left(\lambda\,\frac{\partial g/\partial z_i}{\partial f/\partial x_i}\right)^{1/2}, \quad i = 1, 2, \ldots, n \tag{12.68}$$

To find $\lambda$ we first find the linear approximation of $g$ at a reference point (trial design)
$\mathbf{Z}_0$ (or $\mathbf{X}_0$) as

$$g(\mathbf{Z}) \approx g(\mathbf{Z}_0) + \sum_{i=1}^n \left.\frac{\partial g}{\partial z_i}\right|_{\mathbf{Z}_0}(z_i - z_{0i}) = g_0 + \sum_{i=1}^n \left.\frac{\partial g}{\partial z_i}\right|_{\mathbf{Z}_0} z_i \tag{12.69}$$

where

$$g_0 = g(\mathbf{Z}_0) - \sum_{i=1}^n \left.\frac{\partial g}{\partial z_i}\right|_{\mathbf{Z}_0} z_{0i} = g(\mathbf{X}_0) + \sum_{i=1}^n \left.\frac{\partial g}{\partial x_i}\right|_{\mathbf{X}_0} x_{0i} \tag{12.70}$$

and $z_{0i}$ is the $i$th component of $\mathbf{Z}_0$ with $x_{0i} = 1/z_{0i}$. By setting Eq. (12.69) equal to
zero and substituting Eq. (12.68) for $x_i = 1/z_i$, we obtain

$$\lambda = \left[-\frac{1}{g_0}\sum_{i=1}^n \left(\frac{\partial f}{\partial x_i}\,\frac{\partial g}{\partial z_i}\right)^{1/2}_{\mathbf{Z}_0}\right]^2 \tag{12.71}$$

Equations (12.71) and (12.68) can now be used iteratively to find the optimal solution
of the problem. The procedure is explained through the following example.


Example 12.5 The problem of minimum weight design subject to a constraint on the
vertical displacement of node $S$ ($U_1$) of the three-bar truss shown in Fig. 12.6 can be
stated as follows:

Find $\mathbf{X} = \{x_1\ x_2\}^T$ which minimizes

$$f(\mathbf{X}) = \rho(2\sqrt{2}\,l)x_1 + \rho l x_2 = 80.0445x_1 + 28.3x_2 \tag{E1}$$

subject to

$$\frac{U_1}{U_{\max}} - 1 \le 0$$

or

$$g(\mathbf{X}) = \frac{1}{x_1 + \sqrt{2}\,x_2} - 1 \le 0 \tag{E2}$$

where $\rho$ is the weight density, $E$ is Young's modulus, $U_{\max}$ the maximum permissible
displacement, $x_1$ the area of cross section of bars 1 and 3, $x_2$ the area of cross section
of bar 2, and the vertical displacement of node $S$ is given by

$$U_1 = \frac{Pl}{E}\,\frac{1}{x_1 + \sqrt{2}\,x_2} \tag{E3}$$

Find the solution of the problem using the optimality criteria method.

[Figure 12.6 Three-bar truss ($l$ = 100 in.).]

SOLUTION The partial derivatives of $f$ and $g$ required by Eqs. (12.68) and (12.71)
can be computed as

$$\frac{\partial f}{\partial x_1} = 80.0445, \qquad \frac{\partial f}{\partial x_2} = 28.3$$

$$\frac{\partial g}{\partial z_i} = \frac{\partial g}{\partial x_i}\frac{\partial x_i}{\partial z_i} = \frac{\partial g}{\partial x_i}\left(-x_i^2\right), \quad i = 1, 2$$

$$\frac{\partial g}{\partial x_1} = \frac{-1}{(x_1 + \sqrt{2}\,x_2)^2}, \qquad \frac{\partial g}{\partial x_2} = \frac{-\sqrt{2}}{(x_1 + \sqrt{2}\,x_2)^2}$$

At any design $\mathbf{X}_i$, Eq. (12.70) gives

$$g_0 = g(\mathbf{X}_i) + \left.\frac{\partial g}{\partial x_1}\right|_{\mathbf{X}_i} x_{i1} + \left.\frac{\partial g}{\partial x_2}\right|_{\mathbf{X}_i} x_{i2} = \frac{1}{x_{i1} + \sqrt{2}\,x_{i2}} - 1 - \frac{x_{i1}}{(x_{i1} + \sqrt{2}\,x_{i2})^2} - \frac{\sqrt{2}\,x_{i2}}{(x_{i1} + \sqrt{2}\,x_{i2})^2}$$

Thus the values of $\lambda$ and $(x_1, x_2)$ can be determined iteratively using Eqs. (12.71) and
(12.68). Starting from the initial design $(x_1, x_2) = (2.0, 2.0)$ in$^2$, the results obtained
are shown in Table 12.1.
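The iteration described in this example can be sketched in a few lines. The code below is an illustrative reading of the procedure, not a reproduction of Table 12.1: the multiplier is obtained by setting the linearized constraint of Eq. (12.69) to zero with $z_i = 1/x_i$ taken from Eq. (12.68), which is the assumption under which the `sqrt_lam` expression is written. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
DF_DX = (80.0445, 28.3)          # gradient of f from Eq. (E1)

def g(x):
    """Normalized displacement constraint of Eq. (E2)."""
    return 1.0 / (x[0] + SQRT2 * x[1]) - 1.0

def dg_dx(x):
    s2 = (x[0] + SQRT2 * x[1]) ** 2
    return (-1.0 / s2, -SQRT2 / s2)

def linearization(x):
    """Return (g0, dg/dz) at the design x, with z_i = 1/x_i."""
    gx = dg_dx(x)
    gz = tuple(-gx[i] * x[i] ** 2 for i in range(2))   # dg/dz_i = dg/dx_i * (-x_i^2)
    g0 = g(x) + gx[0] * x[0] + gx[1] * x[1]            # Eq. (12.70)
    return g0, gz

def oc_step(x):
    """One optimality criteria update; returns (new design, lambda)."""
    g0, gz = linearization(x)
    # Assumed form: linearized constraint set to zero, x_i from Eq. (12.68)
    sqrt_lam = -sum(math.sqrt(DF_DX[i] * gz[i]) for i in range(2)) / g0
    lam = sqrt_lam ** 2
    x_new = tuple(math.sqrt(lam * gz[i] / DF_DX[i]) for i in range(2))
    return x_new, lam

if __name__ == "__main__":
    x = (2.0, 2.0)
    for it in range(5):
        x, lam = oc_step(x)
        f = DF_DX[0] * x[0] + DF_DX[1] * x[1]
        print(it + 1, x, lam, f, g(x))
```

Because the true constraint is not linear in the reciprocal variables for this indeterminate truss, each step satisfies only the linearized constraint, so several iterations are needed.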
REVIEW QUESTIONS

12.1 Answer true or false: 

(a) Design variables of an optimal control problem include both state and control 
variables. 

(b) Reciprocal approximations consider reciprocals of member areas as design vari- 
ables. 

(c) Optimality criteria methods can be used for the optimization of nonlinear struc- 
tures with displacement constraints. 

(d) A variational operator is similar to a differential operator. 

(e) Calculus of variations can be used only for finding the extrema of functionals 
with no constraints. 

(f) Optimality criteria methods can be used to solve any optimization problem. 

12.2 Define the following terms: 

(a) Brachistochrone 

(b) State vector 

(c) Performance index 

(d) Adjoint equations 

(e) Transversality condition 

(f) Optimality criteria methods 
(g) Functional

(h) Hamiltonian 

12.3 Match the following terms and descriptions: 

Linear elastic structures 
Lagrange multipliers 
Necessary conditions of optimality 
Optimization of functionals 
Hamiltonian used 


(a) Adjoint variables 

(b) Optimality criteria methods 

(c) Calculus of variations 

(d) Optimal control theory 

(e) Governing equations 


12.4 What are the characteristics of a variational operator? 

12.5 What are Euler-Lagrange equations?

12.6 Which method can be used to solve a trajectory optimization problem? 

12.7 What is an optimality criteria method? 

12.8 What is the basis of optimality criteria methods? 

12.9 What are the advantages of using reciprocal approximations in structural optimization? 

12.10 What is the difference between free and forced boundary conditions? 

12.11 What type of problems require introduction of Lagrange multipliers? 

12.12 Where are reciprocal approximations used? Why? 


PROBLEMS 


12.1 Find the curve connecting two points A(0, 0) and B(2, 0) such that the length of the
line is a minimum and the area under the curve is $\pi/2$.

12.2 Prove that the shortest distance between two points is a straight line. Show that the
necessary conditions yield a minimum and not a maximum.

12.3 Find the function $x(t)$ that minimizes the functional


A = 


f 


+ 2x? + , §) 


dt 


with the condition that x(0) = 2. 

12.4 Find the closed plane curve of length L that encloses a maximum area. 

12.5 The potential energy of an elastic circular annular plate of radii $r_1$ and $r_2$ shown in
Fig. 12.7 is given by

$$\pi_0 = \pi D\int_{r_1}^{r_2}\left[r\left(\frac{d^2w}{dr^2}\right)^2 + \frac{1}{r}\left(\frac{dw}{dr}\right)^2 + 2\nu\,\frac{dw}{dr}\frac{d^2w}{dr^2}\right]dr - 2\pi\int_{r_1}^{r_2} q r w\,dr + 2\pi\left[rM\frac{dw}{dr} - rQw\right]_{r_1}^{r_2}$$

where $D$ is the flexural rigidity of the plate, $w$ the transverse deflection of the plate, $\nu$
the Poisson's ratio, $M$ the radial bending moment per unit of circumferential length, and
$Q$ the radial shear force per unit of circumferential length. Find the differential equation
and the boundary conditions to be satisfied by minimizing $\pi_0$.

Figure 12.7 Circular annular plate under load.

12.6 Consider the two-bar truss shown in Fig. 12.8. For the minimum-weight design of the 
truss with a bound on the horizontal displacement of node S, we need to solve the 
following problem: 

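Find $\mathbf{X} = \{x_1\ x_2\}^T$ which minimizes

$$f(\mathbf{X}) = \sqrt{2}\,l(x_1 + x_2) = 60\sqrt{2}\,(x_1 + x_2)$$

subject to

$$g(\mathbf{X}) = \frac{Pl}{2E}\left(\frac{1}{x_1} + \frac{1}{x_2}\right) - U_{\max} = 10^{-3}\left(\frac{1}{x_1} + \frac{1}{x_2}\right) - 10^{-2} \le 0$$

$$0.1\ \text{in.}^2 \le x_i \le 1.0\ \text{in.}^2, \quad i = 1, 2$$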

Find the solution of the problem using the optimality criteria method. 

12.7 In the three-bar truss considered in Example 12.5 (Fig. 12.6), if the constraint is placed 
on the resultant displacement of node S, the optimization problem can be stated as 


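Find $\mathbf{X} = \{x_1\ x_2\}^T$ which minimizes

$$f(\mathbf{X}) = 80.0445x_1 + 28.3x_2$$

subject to

$$\left(U_1^2 + U_2^2\right)^{1/2} \le U_{\max}$$

or

$$g(\mathbf{X}) = \frac{Pl}{E}\left[\frac{1}{x_1^2} + \frac{1}{(x_1 + \sqrt{2}\,x_2)^2}\right]^{1/2} \le U_{\max}$$

where the vertical and horizontal displacements of node $S$ are given by

$$U_1 = \frac{Pl}{E}\,\frac{1}{x_1 + \sqrt{2}\,x_2} \quad \text{and} \quad U_2 = \frac{Pl}{E}\,\frac{1}{x_1}$$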

Find the solution of the problem using the optimality criteria method. 
Figure 12.8 Two-bar truss subjected to horizontal load ($l$ = 60 in., $P$ = 1000 lb, $E$ = 30 × 10^6 psi).


12.8 The problem of the minimum-weight design of the four-bar truss shown in Fig. 1.32 
(Problem 1.31) subject to a constraint on the vertical displacement of joint A and limi- 
tations on design variables can be stated as follows: 

Find $\mathbf{X} = \{x_1\ x_2\}^T$ which minimizes

$$f(\mathbf{X}) = 0.1x_1 + 0.05773x_2$$

subject to

$$g(\mathbf{X}) = \frac{0.6}{x_1} + \frac{0.3464}{x_2} - 0.1 \le 0$$

$$x_i \ge 4, \quad i = 1, 2$$


where the maximum permissible vertical displacement of joint A is assumed to be 0.01 in. 
Solve the problem using the optimality criteria method. 
Modern Methods of Optimization


13.1 INTRODUCTION 

In recent years, some optimization methods that are conceptually different from the tra- 
ditional mathematical programming techniques have been developed. These methods 
are labeled as modern or nontraditional methods of optimization. Most of these meth- 
ods are based on certain characteristics and behavior of biological, molecular, swarm 
of insects, and neurobiological systems. The following methods are described in this 
chapter: 

1. Genetic algorithms 

2. Simulated annealing 

3. Particle swarm optimization 

4. Ant colony optimization 

5. Fuzzy optimization 

6. Neural-network-based methods

Most of these methods have been developed only in recent years and are emerging 
as popular methods for the solution of complex engineering problems. Most require 
only the function values (and not the derivatives). The genetic algorithms are based 
on the principles of natural genetics and natural selection. Simulated annealing is 
based on the simulation of thermal annealing of critically heated solids. Both genetic 
algorithms and simulated annealing are stochastic methods that can find the global 
minimum with a high probability and are naturally applicable for the solution of discrete 
optimization problems. The particle swarm optimization is based on the behavior of 
a colony of living things, such as a swarm of insects, a flock of birds, or a school of 
fish. Ant colony optimization is based on the cooperative behavior of real ant colonies, 
which are able to find the shortest path from their nest to a food source. In many 
practical systems, the objective function, constraints, and the design data are known 
only in vague and linguistic terms. Fuzzy optimization methods have been developed 
for solving such problems. In neural-network-based methods, the problem is modeled 
as a network consisting of several neurons, and the network is trained suitably to solve 
the optimization problem efficiently. 


Engineering Optimization: Theory and Practice, Fourth Edition Singiresu S. Rao
Copyright © 2009 by John Wiley & Sons, Inc.
13.2 GENETIC ALGORITHMS
13.2.1 Introduction 

Many practical optimum design problems are characterized by mixed continuous- 
discrete variables, and discontinuous and nonconvex design spaces. If standard nonlin- 
ear programming techniques are used for this type of problem they will be inefficient, 
computationally expensive, and, in most cases, find a relative optimum that is closest to 
the starting point. Genetic algorithms (GAs) are well suited for solving such problems, 
and in most cases they can find the global optimum solution with a high probability. 
Although GAs were first presented systematically by Holland [13.1], the basic ideas 
of analysis and design based on the concepts of biological evolution can be found in 
the work of Rechenberg [13.2]. Philosophically, GAs are based on Darwin’s theory of 
survival of the fittest. 

Genetic algorithms are based on the principles of natural genetics and natural 
selection. The basic elements of natural genetics — reproduction, crossover, and 
mutation — are used in the genetic search procedure. GAs differ from the traditional 
methods of optimization in the following respects: 

1. A population of points (trial design vectors) is used for starting the procedure 
instead of a single design point. If the number of design variables is n, usually 
the size of the population is taken as 2n to 4n. Since several points are used as
candidate solutions, GAs are less likely to get trapped at a local optimum. 

2. GAs use only the values of the objective function. The derivatives are not used 
in the search procedure. 

3. In GAs the design variables are represented as strings of binary variables that 
correspond to the chromosomes in natural genetics. Thus the search method 
is naturally applicable for solving discrete and integer programming problems. 
For continuous design variables, the string length can be varied to achieve any 
desired resolution. 

4. The objective function value corresponding to a design vector plays the role of 
fitness in natural genetics. 

5. In every new generation, a new set of strings is produced by using randomized 
parent selection and crossover from the old generation (old set of strings).
Although randomized, GAs are not simple random search techniques. They 
efficiently explore the new combinations with the available knowledge to find 
a new generation with better fitness or objective function value. 


13.2.2 Representation of Design Variables 

In GAs, the design variables are represented as strings of binary numbers, 0 and 1. For
example, if a design variable $x_i$ is denoted by a string of length four (or a four-bit string)
as 0 1 0 1, its integer (decimal equivalent) value will be $(1)2^0 + (0)2^1 + (1)2^2 +
(0)2^3 = 1 + 0 + 4 + 0 = 5$. If each design variable $x_i$, $i = 1, 2, \ldots, n$ is coded in
a string of length $q$, a design vector is represented using a string of total length $nq$.
For example, if a string of length 5 is used to represent each variable, a total string
of length 20 describes a design vector with $n = 4$. The following string of 20 binary
digits denotes the vector ($x_1 = 18$, $x_2 = 3$, $x_3 = 1$, $x_4 = 4$):

    String of length 20:   1 0 0 1 0 | 0 0 0 1 1 | 0 0 0 0 1 | 0 0 1 0 0
                              x_1    |    x_2    |    x_3    |    x_4

In general, if a binary number is given by $b_q b_{q-1} \cdots b_2 b_1 b_0$, where $b_k = 0$ or 1,
$k = 0, 1, 2, \ldots, q$, then its equivalent decimal number $y$ (integer) is given by

$$y = \sum_{k=0}^{q} 2^k b_k \tag{13.1}$$

This indicates that a continuous design variable $x$ can only be represented by a set
of discrete values if binary representation is used. If a variable $x$ (whose bounds are
given by $x^{(l)}$ and $x^{(u)}$) is represented by a string of $q$ binary numbers, as shown in
Eq. (13.1), its decimal value can be computed as

$$x = x^{(l)} + \frac{x^{(u)} - x^{(l)}}{2^q - 1}\sum_{k=0}^{q} 2^k b_k \tag{13.2}$$

Thus if a continuous variable is to be represented with high accuracy, we need to use a
large value of $q$ in its binary representation. In fact, the number of binary digits needed
($q$) to represent a continuous variable in steps (accuracy) of $\Delta x$ can be computed from
the relation

$$2^q \ge \frac{x^{(u)} - x^{(l)}}{\Delta x} + 1 \tag{13.3}$$

For example, if a continuous variable $x$ with bounds 1 and 5 is to be represented with
an accuracy of 0.01, we need to use a binary representation with $q$ digits where

$$2^q \ge \frac{5 - 1}{0.01} + 1 = 401 \quad \text{or} \quad q = 9 \tag{13.4}$$

Equation (13.2) shows why GAs are naturally suited for solving discrete optimization
problems.
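The two relations above translate directly into code. The sketch below finds the smallest $q$ satisfying Eq. (13.3) and maps a binary string to a value between the bounds in the manner of Eq. (13.2). It assumes that $q$ is the number of digits in the string, so that the all-ones string maps exactly to the upper bound; the function names are hypothetical.

```python
def bits_needed(lower, upper, dx):
    """Smallest q with 2**q >= (upper - lower)/dx + 1, as in Eq. (13.3)."""
    n_values = (upper - lower) / dx + 1.0
    q = 0
    while 2 ** q < n_values:
        q += 1
    return q

def decode_real(bits, lower, upper):
    """Map a q-digit binary string to [lower, upper] in the manner of Eq. (13.2)."""
    q = len(bits)
    return lower + (upper - lower) / (2 ** q - 1) * int(bits, 2)

if __name__ == "__main__":
    q = bits_needed(1.0, 5.0, 0.01)
    print("q =", q)
    print(decode_real("0" * q, 1.0, 5.0), decode_real("1" * q, 1.0, 5.0))
```

With $q = 9$ the actual step between adjacent coded values is $4/511$, which is finer than the requested 0.01.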


Example 13.1 Steel plates are available in thicknesses (in inches) of

1/32, 1/16, 3/32, 1/8, 5/32, 3/16, 7/32, 1/4, 9/32, 5/16, 11/32, 3/8, 13/32, 7/16, 15/32, 1/2

from a manufacturer. If the thickness of the steel plate, to be used in the construction
of a pressure vessel, is considered as a discrete design variable, determine the size of
the binary string to be used to select a thickness from the available values.

SOLUTION The lower and upper bounds on the steel plate thickness (design variable, $x$) are
given by 1/32 and 1/2 in., respectively, and the resolution or difference between any two
adjacent thicknesses is 1/32 in. Equation (13.3) gives

$$2^q \ge \frac{x^{(u)} - x^{(l)}}{\Delta x} + 1 = \frac{\frac{1}{2}\ \text{in.} - \frac{1}{32}\ \text{in.}}{\frac{1}{32}\ \text{in.}} + 1 = 16$$

from which the size of the binary string to be used can be obtained as $q = 4$.



13.2.3 Representation of Objective Function and Constraints 

Because genetic algorithms are based on the survival-of-the-fittest principle of nature,
they try to maximize a function called the fitness function. Thus GAs are naturally
suitable for solving unconstrained maximization problems. The fitness function, $F(\mathbf{X})$,
can be taken to be the same as the objective function $f(\mathbf{X})$ of an unconstrained maximization
problem so that $F(\mathbf{X}) = f(\mathbf{X})$. A minimization problem can be transformed into a
maximization problem before applying the GAs. Usually the fitness function is chosen
to be nonnegative. The commonly used transformation to convert an unconstrained
minimization problem to a fitness function is given by

$$F(\mathbf{X}) = \frac{1}{1 + f(\mathbf{X})} \tag{13.5}$$

It can be seen that Eq. (13.5) does not alter the location of the minimum of $f(\mathbf{X})$ but
converts the minimization problem into an equivalent maximization problem.

A general constrained minimization problem can be stated as

Minimize $f(\mathbf{X})$

subject to

$$g_i(\mathbf{X}) \le 0, \quad i = 1, 2, \ldots, m \tag{13.6}$$

and

$$h_j(\mathbf{X}) = 0, \quad j = 1, 2, \ldots, p$$

This problem can be converted into an equivalent unconstrained minimization problem
by using the concept of penalty function as

$$\text{Minimize } \phi(\mathbf{X}) = f(\mathbf{X}) + \sum_{i=1}^m r_i\,\langle g_i(\mathbf{X})\rangle^2 + \sum_{j=1}^p R_j\,\big(h_j(\mathbf{X})\big)^2 \tag{13.7}$$

where $r_i$ and $R_j$ are the penalty parameters associated with the constraints $g_i(\mathbf{X})$ and
$h_j(\mathbf{X})$, whose values are usually kept constant throughout the solution process. In
Eq. (13.7), the function $\langle g_i(\mathbf{X})\rangle$, called the bracket function, is defined as

$$\langle g_i(\mathbf{X})\rangle = \begin{cases} g_i(\mathbf{X}) & \text{if } g_i(\mathbf{X}) > 0 \\ 0 & \text{if } g_i(\mathbf{X}) \le 0 \end{cases} \tag{13.8}$$

In most cases, the penalty parameters associated with all the inequality and equality
constraints are assumed to be the same constants as

$$r_i = r, \quad i = 1, 2, \ldots, m \quad \text{and} \quad R_j = R, \quad j = 1, 2, \ldots, p \tag{13.9}$$

where $r$ and $R$ are constants. The fitness function, $F(\mathbf{X})$, to be maximized in the GAs
can be obtained, similar to Eq. (13.5), as

$$F(\mathbf{X}) = \frac{1}{1 + \phi(\mathbf{X})} \tag{13.10}$$

Equations (13.7) and (13.8) show that the penalty will be proportional to the square of
the amount of violation of the inequality and equality constraints at the design vector
$\mathbf{X}$, while there will be no penalty added to $f(\mathbf{X})$ if all the constraints are satisfied at
the design vector $\mathbf{X}$.
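The penalty transformation and the fitness mapping can be sketched as follows, using single penalty constants $r$ and $R$ as in Eq. (13.9). The function names and the one-variable test problem are illustrative assumptions only.

```python
def bracket(value):
    """Bracket function of Eq. (13.8): the violation if positive, otherwise zero."""
    return value if value > 0.0 else 0.0

def penalized(f, inequalities, equalities, r=1.0, R=1.0):
    """Build phi(X) of Eq. (13.7) with common penalty constants r and R."""
    def phi(x):
        penalty_g = sum(bracket(g(x)) ** 2 for g in inequalities)
        penalty_h = sum(h(x) ** 2 for h in equalities)
        return f(x) + r * penalty_g + R * penalty_h
    return phi

def fitness(phi_value):
    """Fitness of Eq. (13.10); assumes phi_value > -1."""
    return 1.0 / (1.0 + phi_value)

if __name__ == "__main__":
    # Hypothetical problem: minimize x^2 subject to 1 - x <= 0
    phi = penalized(lambda x: x * x, [lambda x: 1.0 - x], [], r=10.0)
    for x in (0.0, 1.0, 2.0):
        print(x, phi(x), fitness(phi(x)))
```

An infeasible design such as $x = 0$ receives a large $\phi$ and therefore a small fitness, so it is unlikely to survive selection.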


13.2.4 Genetic Operators

The solution of an optimization problem by GAs starts with a population of random
strings denoting several (population of) design vectors. The population size in GAs ($n$)
is usually fixed. Each string (or design vector) is evaluated to find its fitness value.
The population (of designs) is operated by three operators (reproduction, crossover,
and mutation) to produce a new population of points (designs). The new population is
further evaluated to find the fitness values and tested for the convergence of the process.
One cycle of reproduction, crossover, and mutation and the evaluation of the fitness
values is known as a generation in GAs. If the convergence criterion is not satisfied,
the population is iteratively operated by the three operators and the resulting new population
is evaluated for the fitness values. The procedure is continued through several
generations until the convergence criterion is satisfied and the process is terminated.
The details of the three operations of GAs are given below.

Reproduction. Reproduction is the first operation applied to the population to select
good strings (designs) of the population to form a mating pool. The reproduction
operator is also called the selection operator because it selects good strings of the
population. The reproduction operator is used to pick above-average strings from the
current population and insert their multiple copies in the mating pool based on a
probabilistic procedure. In a commonly used reproduction operator, a string is selected
for the mating pool with a probability proportional to its fitness. Thus if $F_i$ denotes
the fitness of the $i$th string in the population of size $n$, the probability for selecting the
$i$th string for the mating pool ($p_i$) is given by

$$p_i = \frac{F_i}{\sum_{j=1}^n F_j}, \quad i = 1, 2, \ldots, n \tag{13.11}$$

Note that Eq. (13.11) implies that the sum of the probabilities of the strings of the population
being selected for the mating pool is one. The implementation of the selection
process given by Eq. (13.11) can be understood by imagining a roulette wheel with its
circumference divided into segments, one for each string of the population, with the
segment lengths proportional to the fitness of the strings as shown in Fig. 13.1. By
spinning the roulette wheel $n$ times ($n$ being the population size) and selecting, each
time, the string chosen by the roulette-wheel pointer, we obtain a mating pool of size
$n$. Since the segments of the circumference of the wheel are marked according to the
fitness of the various strings of the original population, the roulette-wheel process is
expected to select $F_i/\bar{F}$ copies of the $i$th string for the mating pool, where $\bar{F}$ denotes
the average fitness of the population:

$$\bar{F} = \frac{1}{n}\sum_{j=1}^n F_j \tag{13.12}$$
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Roulette wheel 



In Fig. 13.1, the population size is assumed to be 6 with fitness values of the strings 
1, 2, 3, 4, 5, and 6 given by 12, 4, 16, 8, 36, and 24, respectively. Since the fifth
string (individual) has the highest fitness value, it is expected to be selected most often
(36% of the time, probabilistically) when the roulette wheel is spun n times (n = 6 in 
Fig. 13.1). The selection scheme, based on the spinning of the roulette wheel, can be 
implemented numerically during computations as follows. 

The probabilities of selecting different strings based on their fitness values are
calculated using Eq. (13.11). These probabilities are used to determine the cumulative
probability of string $i$ being copied to the mating pool, $P_i$, by adding the individual
probabilities of strings 1 through $i$ as

$$P_i = \sum_{j=1}^{i} p_j \qquad (13.13)$$

Thus the roulette-wheel selection process can be implemented by associating the
cumulative probability range $(P_{i-1}, P_i)$, with $P_0 = 0$, to the $i$th string.

Table 13.1 Roulette-Wheel Selection Process for Obtaining the Mating Pool

String      Fitness       Probability of selecting     Cumulative probability     Range of cumulative
number i    value $F_i$   string i for the mating      value of string i,         probability of string i,
                          pool, $p_i$                  $P_i = \sum_{j=1}^{i} p_j$   $(P_{i-1}, P_i)$
1           12            0.12                         0.12                       0.00-0.12
2           4             0.04                         0.16                       0.12-0.16
3           16            0.16                         0.32                       0.16-0.32
4           8             0.08                         0.40                       0.32-0.40
5           36            0.36                         0.76                       0.40-0.76
6           24            0.24                         1.00                       0.76-1.00

To generate the mating pool of size n during numerical computations, n random numbers, each in the range of zero
to one, are generated (or chosen). By treating each random number as the cumulative 
probability of the string to be copied to the mating pool, n strings corresponding to the n 
random numbers are selected as members of the mating pool. By this process, a string
with a higher (lower) fitness value will be selected more (less) frequently for the mating
pool because it has a larger (smaller) range of cumulative probability. Thus strings with
high fitness values in the population, probabilistically, get more copies in the mating 
pool. It is to be noted that no new strings are formed in the reproduction stage; only 
the existing strings in the population get copied to the mating pool. The reproduction 
stage ensures that highly fit individuals (strings) live and reproduce, and less fit indi- 
viduals (strings) die. Thus the GAs simulate the principle of “survival-of-the-fittest” 
of nature. 

Example 13.2 Consider six strings with fitness values 12, 4, 16, 8, 36, and 24 with 
the corresponding roulette wheel as shown in Fig. 13.1. Find the levels of contribution 
of the various strings to the mating pool using the roulette-wheel selection process with 
the following 12 random numbers: 0.41, 0.65, 0.42, 0.80, 0.67, 0.39, 0.63, 0.53, 0.86, 
0.88, 0.75, 0.55. 

Note: (1) These random numbers are taken from Ref. [13.20]. (2) Although the original 
population consists of only 6 strings, the mating pool is assumed to be composed of 
12 strings to illustrate the roulette-wheel selection process. 

SOLUTION If the given random numbers are assumed to represent cumulative 
probabilities, the string numbers to be copied to the mating pool can be determined 
from the cumulative probability ranges listed in the last column of Table 13.1 as 
follows: 


Random number (cumulative probability of the string to be copied):
    0.41  0.65  0.42  0.80  0.67  0.39  0.63  0.53  0.86  0.88  0.75  0.55

String number to be copied to the mating pool:
    5     5     5     6     5     4     5     5     6     6     5     5


This indicates that the mating pool consists of 1 copy of string 4, 8 copies of string 5, 
and 3 copies of string 6. This shows that less fit individuals (strings 1, 2, and 3) did 
not contribute to the next generation (or died) because they could not contribute to the 
mating pool. String 4, although it has a small fitness value, contributed 1 copy to the
mating pool based on the random selection process used.
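
The selection in Example 13.2 can be reproduced with a few lines of Python. The following is only a minimal sketch: the function name `roulette_select` is hypothetical, and the cumulative fitness values are compared with the random number scaled by the total fitness (an assumption made here to avoid floating-point drift in the cumulative probabilities of Eq. (13.13)).

```python
from bisect import bisect_left
from itertools import accumulate


def roulette_select(fitness, random_numbers):
    """Return the 1-based string numbers picked by roulette-wheel selection."""
    total = sum(fitness)
    # Cumulative fitness is the cumulative probability P_i times the total.
    cumulative = list(accumulate(fitness))
    picks = []
    for r in random_numbers:
        # String i is chosen when P_(i-1) < r <= P_i.
        index = bisect_left(cumulative, r * total)
        picks.append(min(index, len(fitness) - 1) + 1)
    return picks


fitness = [12, 4, 16, 8, 36, 24]
numbers = [0.41, 0.65, 0.42, 0.80, 0.67, 0.39,
           0.63, 0.53, 0.86, 0.88, 0.75, 0.55]
pool = roulette_select(fitness, numbers)
copies = {i: pool.count(i) for i in range(1, len(fitness) + 1)}
print(pool)
print(copies)
```

With the 12 random numbers of the example, the sketch gives 1 copy of string 4, 8 copies of string 5, and 3 copies of string 6, in agreement with the solution above.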


Crossover. After reproduction, the crossover operator is implemented. The purpose 
of crossover is to create new strings by exchanging information among strings of the 
mating pool. Many crossover operators have been used in the literature of GAs. In most 
crossover operators, two individual strings (designs) are picked (or selected) at random 
from the mating pool generated by the reproduction operator and some portions of the 
strings are exchanged between the strings. In the commonly used process, known as a
single-point crossover operator, a crossover site is selected at random along the string
length, and the binary digits (alleles) lying on the right side of the crossover site are 
swapped (exchanged) between the two strings. The two strings selected for participation 
in the crossover operators are known as parent strings and the strings generated by the 
crossover operator are known as child strings. 

For example, if two design vectors (parents), each with a string length of 10, are 
given by 


(Parent 1) $\mathbf{X}_1$ = {0 1 0 | 1 0 1 1 0 1 1}

(Parent 2) $\mathbf{X}_2$ = {1 0 0 | 0 1 1 1 1 0 0}

the result of crossover, when the crossover site is 3, is given by

(Offspring 1) $\mathbf{X}_3$ = {0 1 0 | 0 1 1 1 1 0 0}

(Offspring 2) $\mathbf{X}_4$ = {1 0 0 | 1 0 1 1 0 1 1}

Since the crossover operator combines substrings from parent strings (which have 
good fitness values), the resulting child strings created are expected to have better 
fitness values provided an appropriate (suitable) crossover site is selected. However,
the suitable or appropriate crossover site is not known beforehand. Hence the crossover
site is usually chosen randomly. The child strings generated using a random crossover
site may or may not be as good as or better than their parent strings in terms of their
fitness values. If they are as good as or better than their parents, they will contribute to a
faster improvement of the average fitness value of the new population. On the other 
hand, if the child strings created are worse than their parent strings, it should not be of 
much concern to the success of the GAs because the bad child strings will not survive 
very long as they are less likely to be selected in the next reproduction stage (because 
of the survival-of-the-fittest strategy used). 

As indicated above, the effect of crossover may be useful or detrimental. Hence it 
is desirable not to use all the strings of the mating pool in crossover but to preserve 
some of the good strings of the mating pool as part of the population in the next 
generation. In practice, a crossover probability, $p_c$, is used in selecting the parents for
crossover. Thus only $100 p_c$ percent of the strings in the mating pool will be used in
the crossover operator while $100(1 - p_c)$ percent of the strings will be retained as
they are in the new generation (of population). 
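
A minimal Python sketch of the single-point crossover operator is given below. The function names are hypothetical, strings are represented as Python `str` objects of 0s and 1s, and the pairing rule in `crossover_pool` (shuffle the mating pool, then cross each consecutive pair with probability $p_c$) is one common way of applying the crossover probability, assumed here for illustration only.

```python
import random


def single_point_crossover(parent1, parent2, site):
    """Swap the bits to the right of `site` between two equal-length strings."""
    child1 = parent1[:site] + parent2[site:]
    child2 = parent2[:site] + parent1[site:]
    return child1, child2


def crossover_pool(pool, p_c, rng):
    """Pair up the mating pool and cross each pair with probability p_c."""
    pool = pool[:]
    rng.shuffle(pool)
    offspring = []
    for a, b in zip(pool[0::2], pool[1::2]):
        if rng.random() < p_c:
            site = rng.randint(1, len(a) - 1)  # site strictly inside the string
            a, b = single_point_crossover(a, b, site)
        offspring.extend([a, b])
    if len(pool) % 2:  # an unpaired string is carried over unchanged
        offspring.append(pool[-1])
    return offspring


x1 = "0101011011"  # Parent 1
x2 = "1000111100"  # Parent 2
x3, x4 = single_point_crossover(x1, x2, 3)
print(x3, x4)
```

For the two parents of the example above and a crossover site of 3, the sketch returns the two offspring listed in the text.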

Mutation. Crossover is the main operator by which new strings with better fitness
values are created for the new generations. The mutation operator is applied to the new
strings with a specific small mutation probability, $p_m$. The mutation operator changes
the binary digit (allele’s value) 1 to 0 and vice versa. Several methods can be used
for implementing the mutation operator. In single-point mutation, a mutation site
is selected at random along the string length and the binary digit at that site is then
changed from 1 to 0 or 0 to 1 with a probability of $p_m$. In bit-wise mutation, each
bit (binary digit) in the string is considered one at a time in sequence, and the digit
is changed from 1 to 0 or 0 to 1 with a probability $p_m$. Numerically, the process can
be implemented as follows. A random number between 0 and 1 is generated/chosen.
If the random number is smaller than $p_m$, then the binary digit is changed. Otherwise,
the binary digit is not changed.

The purpose of mutation is (1) to generate a string (design point) in the neighborhood
of the current string, thereby accomplishing a local search around the current
solution, (2) to safeguard against a premature loss of important genetic material at a
particular position, and (3) to maintain diversity in the population.

As an example, consider the following population of size n = 5 with a string 
length 10: 

10001 00011 
10111 10100 
11000 01101 
10110 10010 
11100 01001 

Here all five strings have a 1 in the position of the first bit. The true optimum
solution of the problem requires a 0 as the first bit. The required 0 cannot be created
by either the reproduction or the crossover operators. However, when the mutation
operator is used, the first bit of each string is changed from 1 to 0 with a probability
of $p_m$, so a 0 appears in the first-bit location of at least one of the $n$ strings with a
probability of $1 - (1 - p_m)^n$, which is approximately $n p_m$ for small $p_m$.
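
The bit-wise mutation described above can be sketched in Python as follows. The function name `bitwise_mutation` and the value $p_m = 0.05$ are assumptions chosen only for illustration; the population is the one listed in the example. The last line evaluates the probability that at least one of the $n$ strings has a given bit flipped, $1 - (1 - p_m)^n$, which is close to $n p_m$ when $p_m$ is small.

```python
import random


def bitwise_mutation(string, p_m, rng):
    """Flip each bit independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < p_m else bit
        for bit in string
    )


rng = random.Random(0)
population = ["1000100011", "1011110100", "1100001101",
              "1011010010", "1110001001"]
p_m = 0.05  # illustrative value, not taken from the text
mutated = [bitwise_mutation(s, p_m, rng) for s in population]
# Chance that the first bit is flipped in at least one of the n strings.
p_at_least_one = 1 - (1 - p_m) ** len(population)
print(mutated)
print(p_at_least_one, len(population) * p_m)
```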

Note that the three operators — reproduction, crossover, and mutation — are simple 
to implement. The reproduction operator selects good strings for the mating pool, the 
crossover operator recombines the substrings of good strings of the mating pool to 
create strings (next generation of population), and the mutation operator alters the 
string locally. The use of these three operators successively yields new generations 
with improved values of average fitness of the population. Although the improvement
of the fitness of the strings in successive generations cannot be proved mathematically,
the process has generally been found to converge to the optimum fitness value of the
objective function. Note that if any bad strings are created at any stage in the process, they will
be eliminated by the reproduction operator in the next generation. The GAs have been 
successfully used to solve a variety of optimization problems in the literature. 


13.2.5 Algorithm 

The computational procedure involved in maximizing the fitness function
$F(x_1, x_2, \ldots, x_n)$ in the genetic algorithm can be described by the following steps.

1. Choose a suitable string length $l = nq$ to represent the $n$ design variables of
the design vector X. Assume suitable values for the following parameters: population
size $m$, crossover probability $p_c$, mutation probability $p_m$, permissible
value of standard deviation of fitness values of the population $(s_f)_{\max}$ to use as
a convergence criterion, and maximum number of generations ($i_{\max}$) to be used
as a second convergence criterion.

2. Generate a random population of size $m$, each consisting of a string of length
$l = nq$. Evaluate the fitness values $F_i$, $i = 1, 2, \ldots, m$, of the $m$ strings.

3. Carry out the reproduction process.

4. Carry out the crossover operation using the crossover probability $p_c$.

5. Carry out the mutation operation using the mutation probability $p_m$ to find the
new generation of $m$ strings.

6. Evaluate the fitness values $F_i$, $i = 1, 2, \ldots, m$, of the $m$ strings of the new
population. Find the standard deviation of the $m$ fitness values.

7. Test for the convergence of the algorithm or process. If $s_f < (s_f)_{\max}$, the
convergence criterion is satisfied and hence the process may be stopped. Otherwise,
go to step 8.

8. Test for the generation number. If $i > i_{\max}$, the computations have been
performed for the maximum permissible number of generations and hence the
process may be stopped. Otherwise, set the generation number as $i = i + 1$ and
go to step 3.
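
The eight steps above can be sketched as a short Python routine. This is only an illustrative sketch under several assumptions that are not part of the text: the function name `run_ga` and its default parameter values are hypothetical, strings are stored as lists of 0s and 1s, crossover is applied to consecutive pairs of the mating pool, the population standard deviation is used for $s_f$, the fitness is assumed to be strictly positive, and the best string found so far is tracked separately.

```python
import random
import statistics


def run_ga(fitness, n_bits, m=20, p_c=0.9, p_m=0.01,
           s_max=1e-6, i_max=50, seed=0):
    """Maximize a strictly positive fitness over binary strings of n_bits."""
    rng = random.Random(seed)
    # Step 2: random initial population of m binary strings.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(m)]
    fit = [fitness(s) for s in pop]
    best = max(zip(fit, pop))
    i = 1
    while True:
        # Step 3: reproduction by roulette-wheel selection.
        pool = [s[:] for s in rng.choices(pop, weights=fit, k=m)]
        # Step 4: single-point crossover on consecutive pairs.
        for a, b in zip(pool[0::2], pool[1::2]):
            if rng.random() < p_c:
                site = rng.randint(1, n_bits - 1)
                a[site:], b[site:] = b[site:], a[site:]
        # Step 5: bit-wise mutation.
        for s in pool:
            for k in range(n_bits):
                if rng.random() < p_m:
                    s[k] = 1 - s[k]
        # Step 6: evaluate the new population.
        pop = pool
        fit = [fitness(s) for s in pop]
        best = max(best, max(zip(fit, pop)))
        # Steps 7 and 8: convergence and generation-count tests.
        if statistics.pstdev(fit) < s_max or i > i_max:
            return best
        i += 1


# Toy fitness (an assumption): number of 1s in the string, shifted to stay positive.
best_fit, best_string = run_ga(lambda s: 1 + sum(s), n_bits=12)
print(best_fit, best_string)
```

In an engineering problem the fitness function would first decode each substring of length $q$ into a design variable and would include the penalty terms for constraint violations; the toy fitness is used here only to keep the sketch self-contained.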

13.2.6 Numerical Results 

The welded beam problem described in Section 7.22.3 (Fig. 7.23) was considered by 
Deb [13.20] with the following data: population size = 100, total string length = 40, 
substring length for each design variable = 10, probability of crossover = 0.9, and prob- 
ability of mutation = 0.01. Different penalty parameters were considered for different 
constraints in order to have the contribution of each constraint violation to the objec- 
tive function be approximately the same. Nearly optimal solutions were obtained after 
only about 15 generations with approximately 0.9 × 100 × 15 = 1350 function evaluations.
The optimum solution was found to be $x_1^* = 0.2489$, $x_2^* = 6.1730$, $x_3^* = 8.1789$,
$x_4^* = 0.2533$, and $f^* = 2.43$, which can be compared with the solution obtained from
geometric programming, $x_1^* = 0.2455$, $x_2^* = 6.1960$, $x_3^* = 8.2730$, $x_4^* = 0.2455$, and
$f^* = 2.39$ [13.21]. Although the optimum solution given by the GAs corresponds to
a slightly larger value of $f^*$, it satisfies all the constraints (the solution obtained from
geometric programming violates three constraints slightly).


13.3 SIMULATED ANNEALING 
13.3.1 Introduction 

The simulated annealing method is based on the simulation of thermal annealing of 
critically heated solids. When a solid (metal) is brought into a molten state by heating 
it to a high temperature, the atoms in the molten metal move freely with respect to each 
other. However, the movements of atoms get restricted as the temperature is reduced. 
As the temperature reduces, the atoms tend to get ordered and finally form crystals 
having the minimum possible internal energy. The process of formation of crystals 
essentially depends on the cooling rate. When the temperature of the molten metal is 
reduced at a very fast rate, it may not be able to achieve the crystalline state; instead, 
it may attain a polycrystalline state having a higher energy state compared to that of 
the crystalline state. In engineering applications, rapid cooling may introduce defects 
inside the material. Thus the temperature of the heated solid (molten metal) needs to 
be reduced at a slow and controlled rate to ensure proper solidification with a highly 
ordered crystalline state that corresponds to the lowest energy state (internal energy). 
This process of cooling at a slow rate is known as annealing.

13.3.2 Procedure

The simulated annealing method simulates the process of slow cooling of molten 
metal to achieve the minimum function value in a minimization problem. The cooling 
phenomenon of the molten metal is simulated by introducing a temperature-like parameter
and controlling it using the concept of Boltzmann’s probability distribution.
Boltzmann’s probability distribution implies that the energy ($E$) of a system in thermal
equilibrium at temperature $T$ is distributed probabilistically according to the relation

$$P(E) = e^{-E/kT} \qquad (13.14)$$

where $P(E)$ denotes the probability of achieving the energy level $E$, and $k$ is called
Boltzmann’s constant. Equation (13.14) shows that at high temperatures the system has
nearly a uniform probability of being at any energy state; however, at low temperatures, 
the system has a small probability of being at a high-energy state. This indicates that 
when the search process is assumed to follow Boltzmann’s probability distribution, the 
convergence of the simulated annealing algorithm can be controlled by controlling the 
temperature T . The method of implementing the Boltzmann’s probability distribution 
in simulated thermodynamic systems, suggested by Metropolis et al. [13.37], can also 
be used in the context of minimization of functions. 

In the case of function minimization, let the current design point (state) be $\mathbf{X}_i$,
with the corresponding value of the objective function given by $f_i = f(\mathbf{X}_i)$. Similar
to the energy state of a thermodynamic system, the energy $E_i$ at state $\mathbf{X}_i$ is given by

$$E_i = f_i = f(\mathbf{X}_i) \qquad (13.15)$$

Then, according to the Metropolis criterion, the probability of the next design point
(state) $\mathbf{X}_{i+1}$ depends on the difference in the energy state or function values at the two
design points (states) given by

$$\Delta E = E_{i+1} - E_i = \Delta f = f_{i+1} - f_i = f(\mathbf{X}_{i+1}) - f(\mathbf{X}_i) \qquad (13.16)$$

The new state or design point $\mathbf{X}_{i+1}$ can be found using Boltzmann’s probability
distribution:

$$P[E_{i+1}] = \min\left\{1, e^{-\Delta E/kT}\right\} \qquad (13.17)$$

Boltzmann’s constant serves as a scaling factor in simulated annealing and, as such,
can be chosen as 1 for simplicity. Note that if $\Delta E < 0$, Eq. (13.17) gives $P[E_{i+1}] = 1$
and hence the point $\mathbf{X}_{i+1}$ is always accepted. This is a logical choice in the context of
minimization of a function because the function value at $\mathbf{X}_{i+1}$, $f_{i+1}$, is better (smaller)
than at $\mathbf{X}_i$, $f_i$, and hence the design vector $\mathbf{X}_{i+1}$ must be accepted. On the other
hand, when $\Delta E > 0$, the function value $f_{i+1}$ at $\mathbf{X}_{i+1}$ is worse (larger) than the one at
$\mathbf{X}_i$. According to most conventional optimization procedures, the point $\mathbf{X}_{i+1}$ cannot be
accepted as the next point in the iterative process. However, the probability of accepting
the point $\mathbf{X}_{i+1}$, in spite of its being worse than $\mathbf{X}_i$ in terms of the objective function
value, is finite (although it may be small) according to the Metropolis criterion. Note
that the probability of accepting the point $\mathbf{X}_{i+1}$,

$$P[E_{i+1}] = e^{-\Delta E/kT} \qquad (13.18)$$

is not the same in all situations. As can be seen from Eq. (13.18), this probability depends
on the values of $\Delta E$ and $T$. If the temperature $T$ is large, the probability will be high
for design points $\mathbf{X}_{i+1}$ with larger function values (with larger values of $\Delta E = \Delta f$).
Thus at high temperatures, even worse design points $\mathbf{X}_{i+1}$ are likely to be accepted
because of larger probabilities. However, if the temperature $T$ is small, the probability
of accepting worse design points $\mathbf{X}_{i+1}$ (with larger values of $\Delta E = \Delta f$) will be small.
Thus as the temperature values get smaller (that is, as the process gets closer to the
optimum solution), the design points $\mathbf{X}_{i+1}$ with larger function values compared to the
one at $\mathbf{X}_i$ are less likely to be accepted.
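
The effect of the temperature on the acceptance probability of Eqs. (13.17) and (13.18) can be seen from a small Python sketch. The function names are hypothetical, Boltzmann’s constant is taken as $k = 1$, and the uphill step $\Delta E = 38.7312$ and the temperature $T = 384.25$ are borrowed from Example 13.3; the two lower temperatures are arbitrary illustrative values.

```python
import math
import random


def acceptance_probability(delta_e, temperature, k=1.0):
    """Eq. (13.17): P = min{1, exp(-dE / kT)}."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / (k * temperature))


def metropolis_accept(delta_e, temperature, r=None, k=1.0):
    """Accept when a uniform random number r in [0, 1) falls below P."""
    if r is None:
        r = random.random()
    return r < acceptance_probability(delta_e, temperature, k)


# The same uphill step becomes less acceptable as the temperature drops.
for T in (384.25, 38.425, 3.8425):
    print(T, acceptance_probability(38.7312, T))
```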

13.3.3 Algorithm 

The SA algorithm can be summarized as follows. Start with an initial design vector
$\mathbf{X}_1$ (iteration number $i = 1$) and a high value of temperature $T$. Generate a new design
point randomly in the vicinity of the current design point and find the difference in
function values:

$$\Delta E = \Delta f = f_{i+1} - f_i = f(\mathbf{X}_{i+1}) - f(\mathbf{X}_i) \qquad (13.19)$$

If $f_{i+1}$ is smaller than $f_i$ (with a negative value of $\Delta f$), accept the point $\mathbf{X}_{i+1}$ as
the next design point. Otherwise, when $\Delta f$ is positive, accept the point $\mathbf{X}_{i+1}$ as the
next design point only with a probability $e^{-\Delta E/kT}$. This means that if the value of a
randomly generated number in the range (0, 1) is smaller than $e^{-\Delta E/kT}$, accept the point
$\mathbf{X}_{i+1}$; otherwise, reject the point $\mathbf{X}_{i+1}$. This completes one iteration of the SA algorithm. If the point
$\mathbf{X}_{i+1}$ is rejected, then the process of generating a new design point $\mathbf{X}_{i+1}$ randomly in
the vicinity of the current design point, evaluating the corresponding objective function
value $f_{i+1}$, and deciding to accept $\mathbf{X}_{i+1}$ as the new design point, based on the use
of the Metropolis criterion, Eq. (13.18), is continued. To simulate the attainment of
thermal equilibrium at every temperature, a predetermined number ($n$) of new points
$\mathbf{X}_{i+1}$ are tested at any specific value of the temperature $T$.

Once the number of new design points $\mathbf{X}_{i+1}$ tested at any temperature $T$ exceeds the
value of $n$, the temperature $T$ is reduced by a prespecified fractional value $c$ ($0 < c < 1$)
and the whole process is repeated. The procedure is assumed to have converged when
the current value of temperature $T$ is sufficiently small or when changes in the function
values ($\Delta f$) are observed to be sufficiently small.

The choices of the initial temperature T, the number of iterations n before reduc- 
ing the temperature, and the temperature reduction factor c play important roles in the 
successful convergence of the SA algorithm. For example, if the initial temperature 
T is too large, it requires a larger number of temperature reductions for convergence. 
On the other hand, if the initial temperature is chosen to be too small, the search 
process may be incomplete in the sense that it might fail to thoroughly investigate 
the design space in locating the global minimum before convergence. The tempera- 
ture reduction factor c has a similar effect. Too large a value of c (such as 0.8 or 
0.9) requires too much computational effort for convergence. On the other hand, too 
small a value of c (such as 0.1 or 0.2) may result in a faster reduction in tempera- 
ture that might not permit a thorough exploration of the design space for locating the 
global minimum solution. Similarly, a large value of the number of iterations n will 
help in achieving a quasiequilibrium state at each temperature but will result in a larger
computational effort. A smaller value of n, on the other hand, might result either in a
premature convergence or convergence to a local minimum (due to inadequate exploration
of the design space for the global minimum). Unfortunately, no unique set of
values is available for T, n, and c that will work well for every problem. However,
certain guidelines can be given for selecting these values. The initial temperature T
can be chosen as the average value of the objective function computed at a number
of randomly selected points in the design space. The number of iterations n can be
chosen between 50 and 100 based on the computing resources and the desired accuracy
of solution. The temperature reduction factor c can be chosen between 0.4 and
0.6 for a reasonable temperature reduction strategy (also termed the cooling schedule).
More complex cooling schedules, based on the expected mathematical convergence 
rates, have been used in the literature for the solution of complex practical optimiza- 
tion problems [13.19]. In spite of all the research being done on SA algorithms, the 
choice of the initial temperature T, the number of iterations n at any specific temperature,
and the temperature reduction factor (or cooling rate) c still remains an art
and generally requires a trial-and-error process to find suitable values for solving any
particular type of optimization problem. The SA procedure is shown as a flowchart
in Fig. 13.2. 
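
The procedure described in this section can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `simulated_annealing`, the fixed neighborhood of ±6 about the current point, the stopping temperature `T_min`, and the bookkeeping of the best point found are all assumptions. The guideline values $c = 0.5$ and $n = 50$ are taken from the text, and the test function and starting data are those of Example 13.3.

```python
import math
import random


def simulated_annealing(f, x0, T0, c=0.5, n=50, half_width=6.0,
                        T_min=1e-3, seed=0):
    """Minimize f starting from x0 with a geometric cooling schedule."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best_x, best_f = list(x), fx
    T = T0
    while T > T_min:
        for _ in range(n):  # n trial points at each temperature
            # Random point within +/- half_width of the current point.
            x_new = [xi + rng.uniform(-half_width, half_width) for xi in x]
            f_new = f(x_new)
            delta = f_new - fx
            # Metropolis criterion with k = 1.
            if delta <= 0 or rng.random() < math.exp(-delta / T):
                x, fx = x_new, f_new
                if fx < best_f:
                    best_x, best_f = list(x), fx
        T *= c  # reduce the temperature by the factor c
    return best_x, best_f


def f(x):
    x1, x2 = x
    return 500 - 20*x1 - 26*x2 - 4*x1*x2 + 4*x1**2 + 3*x2**2


best_x, best_f = simulated_annealing(f, [4.0, 5.0], T0=384.25)
print(best_x, best_f)
```

Setting the gradient of this quadratic function to zero gives the stationary point $x_1 = 7$, $x_2 = 9$ with $f = 313$, which provides a reference value for judging the result returned by the sketch.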


13.3.4 Features of the Method

Some of the features of simulated annealing are as follows: 

1. The quality of the final solution is not affected by the initial guesses, except 
that the computational effort may increase with worse starting designs. 

2. Because of the discrete nature of the function and constraint evaluations, the 
convergence or transition characteristics are not affected by the continuity or 
differentiability of the functions. 

3. The convergence is also not influenced by the convexity status of the feasible 
space. 

4. The design variables need not be positive. 

5. The method can be used to solve mixed-integer, discrete, or continuous 
problems. 

6. For problems involving behavior constraints (in addition to lower and upper
bounds on the design variables), an equivalent unconstrained function is to be 
formulated as in the case of genetic algorithms. 

13.3.5 Numerical Results 

The welded beam problem of Section 7.22.3 (Fig. 7.23) is solved using simulated
annealing. The solution is given by $x_1^* = 0.2471$, $x_2^* = 6.1451$, $x_3^* = 8.2721$,
$x_4^* = 0.2495$, and $f^* = 2.4148$. This solution can be compared with the solutions
obtained by genetic algorithms ($x_1^* = 0.2489$, $x_2^* = 6.1730$, $x_3^* = 8.1789$, $x_4^* = 0.2533$,
and $f^* = 2.4331$) and geometric programming ($x_1^* = 0.2536$, $x_2^* = 7.1410$,
$x_3^* = 7.1044$, $x_4^* = 0.2536$, and $f^* = 2.3398$). Notice that the solution given by geometric
programming [13.21] violated three constraints slightly, while the solutions given by
the genetic algorithms [13.20] and simulated annealing satisfied all the constraints.

Figure 13.2 Simulated annealing procedure.


Example 13.3 Find the minimum of the following function using simulated annealing:

$$f(\mathbf{X}) = 500 - 20x_1 - 26x_2 - 4x_1 x_2 + 4x_1^2 + 3x_2^2$$

SOLUTION We follow the procedure indicated in the flowchart of Fig. 13.2.

Step 1: Choose the parameters of the SA method. The initial temperature is taken
as the average value of $f$ evaluated at four randomly selected points in the
design space. By selecting the random points as $\mathbf{X}^{(1)} = \{2, 0\}^T$, $\mathbf{X}^{(2)} = \{5, 10\}^T$,
$\mathbf{X}^{(3)} = \{8, 5\}^T$, $\mathbf{X}^{(4)} = \{10, 10\}^T$, we find the corresponding values of the objective
function as $f^{(1)} = 476$, $f^{(2)} = 340$, $f^{(3)} = 381$, $f^{(4)} = 340$, respectively. Noting
that the average value of the objective functions $f^{(1)}$, $f^{(2)}$, $f^{(3)}$, and $f^{(4)}$
is 384.25, we assume the initial temperature to be $T = 384.25$. The temperature
reduction factor is chosen as $c = 0.5$. To make the computations brief, we
choose the maximum permissible number of iterations (at any specific value
of temperature) as $n = 2$. We select the initial design point as $\mathbf{X}_1 = \{4, 5\}^T$.

Step 2: Evaluate the objective function value at $\mathbf{X}_1$ as $f_1 = 349.0$ and set the iteration
number as $i = 1$.

Step 3: Generate a new design point in the vicinity of the current design point. For
this, we select two uniformly distributed random numbers $u_1$ and $u_2$; $u_1$ for
$x_1$ in the vicinity of 4 and $u_2$ for $x_2$ in the vicinity of 5. The numbers $u_1$ and
$u_2$ are chosen as 0.31 and 0.57, respectively. By choosing the ranges of $x_1$
and $x_2$ as (−2, 10) and (−1, 11), which represent ranges of ±6 about their
respective current values, the uniformly distributed random numbers $r_1$ and $r_2$
in the ranges of $x_1$ and $x_2$, corresponding to $u_1$ and $u_2$, can be found as

$$r_1 = -2 + u_1\{10 - (-2)\} = -2 + 0.31(12) = 1.72$$

$$r_2 = -1 + u_2\{11 - (-1)\} = -1 + 0.57(12) = 5.84$$

which gives $\mathbf{X}_2 = \{r_1, r_2\}^T = \{1.72, 5.84\}^T$.

Since the objective function value $f_2 = f(\mathbf{X}_2) = 387.7312$, the value of $\Delta f$
is given by

$$\Delta f = f_2 - f_1 = 387.7312 - 349.0 = 38.7312$$

Step 4: Since the value of $\Delta f$ is positive, we use the Metropolis criterion to decide
whether to accept or reject the new point. For this we choose a random
number in the range (0, 1) as $r = 0.83$. Equation (13.18) gives the probability
of accepting the new design point $\mathbf{X}_2$ as

$$P[\mathbf{X}_2] = e^{-\Delta f/kT} \qquad (E_1)$$

By assuming the value of Boltzmann’s constant $k$ to be 1 for simplicity in
Eq. ($E_1$), we obtain

$$P[\mathbf{X}_2] = e^{-\Delta f/kT} = e^{-38.7312/384.25} = 0.9041$$

Since $r = 0.83$ is smaller than 0.9041, we accept the point $\mathbf{X}_2 = \{1.72, 5.84\}^T$ as the
next design point. Note that, although the objective function value $f_2$ is larger
than $f_1$, we accept $\mathbf{X}_2$ because this is an early stage of simulation and the
current temperature is high. Update the iteration number as $i = 2$. Since the
iteration number $i$ is less than or equal to $n$, we proceed to step 3.

Step 3: Generate a new design point in the vicinity of the current design point
$\mathbf{X}_2 = \{1.72, 5.84\}^T$. For this, we choose the range of each design variable as ±6 about
its current value so that the ranges are given by (−6 + 1.72, 6 + 1.72) =
(−4.28, 7.72) for $x_1$ and (−6 + 5.84, 6 + 5.84) = (−0.16, 11.84) for $x_2$. By
selecting two uniformly distributed random numbers in the range (0, 1) as
$u_1 = 0.92$ and $u_2 = 0.73$, the corresponding uniformly distributed random
numbers in the ranges of $x_1$ and $x_2$ become

$$r_1 = -4.28 + u_1\{7.72 - (-4.28)\} = -4.28 + 0.92(12) = 6.76$$

$$r_2 = -0.16 + u_2\{11.84 - (-0.16)\} = -0.16 + 0.73(12) = 8.60$$

which gives $\mathbf{X}_3 = \{r_1, r_2\}^T = \{6.76, 8.60\}^T$ with a function value of $f_3 = 313.3264$.
We note that the function value $f_3$ is better than $f_2$ with
$\Delta f = f_3 - f_2 = 313.3264 - 387.7312 = -74.4048$.

Step 4: Since $\Delta f < 0$, we accept the current point as $\mathbf{X}_3$ and increase the iteration
number to $i = 3$. Since $i > n$, we go to step 5.

Step 5: Since a cycle of iterations with the current value of temperature is completed,
we reduce the temperature to a new value of $T = 0.5(384.25) = 192.125$.
Reset the current iteration number as $i = 1$ and go to step 3.

Step 3: Generate a new design point in the vicinity of the current design point $\mathbf{X}_3$ and
continue the procedure until the temperature is reduced to a small value (until
convergence).
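
The arithmetic of the first temperature cycle of Example 13.3 can be checked with the following Python sketch. The helper name `trial_point` is hypothetical; the random numbers ($u_1$, $u_2$, and $r$), the ±6 range, and the reduction factor 0.5 are the ones quoted in the example, and $k = 1$ is assumed.

```python
import math


def f(x1, x2):
    return 500 - 20*x1 - 26*x2 - 4*x1*x2 + 4*x1**2 + 3*x2**2


def trial_point(x, u, half_width=6.0):
    """Map uniform numbers u in (0, 1) onto +/- half_width about x."""
    return tuple(xi - half_width + ui * 2 * half_width
                 for xi, ui in zip(x, u))


T = 384.25
x1 = (4.0, 5.0)
f1 = f(*x1)

# First trial: u = (0.31, 0.57), Metropolis random number r = 0.83.
x2 = trial_point(x1, (0.31, 0.57))
f2 = f(*x2)
p2 = math.exp(-(f2 - f1) / T)
accept2 = 0.83 < p2

# Second trial: u = (0.92, 0.73); a downhill move is always accepted.
x3 = trial_point(x2, (0.92, 0.73))
f3 = f(*x3)

# End of the cycle (n = 2): reduce the temperature by c = 0.5.
T = 0.5 * T
print(x2, f2, p2, accept2)
print(x3, f3, T)
```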


13.4 PARTICLE SWARM OPTIMIZATION 
13.4.1 Introduction 

Particle swarm optimization, abbreviated as PSO, is based on the behavior of a colony 
or swarm of insects, such as ants, termites, bees, and wasps; a flock of birds; or a 
school of fish. The particle swarm optimization algorithm mimics the behavior of these 
social organisms. The word particle denotes, for example, a bee in a colony or a 
bird in a flock. Each individual or particle in a swarm behaves in a distributed way 
using its own intelligence and the collective or group intelligence of the swarm. As 
such, if one particle discovers a good path to food, the rest of the swarm will also be 
able to follow the good path instantly even if their location is far away in the swarm. 
Optimization methods based on swarm intelligence are called behaviorally inspired 
algorithms as opposed to the genetic algorithms, which are called evolution-based 
procedures. The PSO algorithm was originally proposed by Kennedy and Eberhart in
1995 [13.34].

In the context of multivariable optimization, the swarm is assumed to be of specified 
or fixed size with each particle located initially at random locations in the multidimen- 
sional design space. Each particle is assumed to have two characteristics: a position 
and a velocity. Each particle wanders around in the design space and remembers the 
best position (in terms of the food source or objective function value) it has discov- 
ered. The particles communicate information or good positions to each other and adjust 
their individual positions and velocities based on the information received on the good 
positions.

As an example, consider the behavior of birds in a flock. Although each bird has
limited intelligence by itself, it follows these simple rules:

1. It tries not to come too close to other birds. 

2. It steers toward the average direction of other birds. 

3. It tries to fit the “average position” between other birds with no wide gaps in 
the flock. 

Thus the behavior of the flock or swarm is based on a combination of three simple 
factors: 

1. Cohesion — stick together. 

2. Separation — don’t come too close. 

3. Alignment — follow the general heading of the flock. 

The PSO is developed based on the following model: 

1. When one bird locates a target or food (or maximum of the objective function), 
it instantaneously transmits the information to all other birds. 

2. All other birds gravitate to the target or food (or maximum of the objective 
function), but not directly. 

3. There is a component of each bird’s own independent thinking as well as its 
past memory. 

Thus the model simulates a random search in the design space for the maximum value 
of the objective function. As such, gradually over many iterations, the birds go to the 
target (or maximum of the objective function). 

13.4.2 Computational Implementation of PSO 

Consider an unconstrained maximization problem: 

Maximize $f(\mathbf{X})$

with $\mathbf{X}^{(l)} \le \mathbf{X} \le \mathbf{X}^{(u)}$ (13.20)

where $\mathbf{X}^{(l)}$ and $\mathbf{X}^{(u)}$ denote the lower and upper bounds on $\mathbf{X}$, respectively. The PSO
procedure can be implemented through the following steps.

1. Assume the size of the swarm (number of particles) is N. To reduce the total 
number of function evaluations needed to find a solution, we must assume a 
smaller size of the swarm. But with too small a swarm size it is likely to take 
us longer to find a solution or, in some cases, we may not be able to find a 
solution at all. Usually a size of 20 to 30 particles is assumed for the swarm as 
a compromise. 

2. Generate the initial population of $\mathbf{X}$ in the range $\mathbf{X}^{(l)}$ and $\mathbf{X}^{(u)}$ randomly as
$\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N$. Hereafter, for convenience, the particle (position of) $j$ and
its velocity in iteration $i$ are denoted as $\mathbf{X}_j^{(i)}$ and $\mathbf{V}_j^{(i)}$, respectively (also written
$\mathbf{X}_j(i)$ and $\mathbf{V}_j(i)$). Thus the particles generated initially are denoted
$\mathbf{X}_1(0), \mathbf{X}_2(0), \ldots, \mathbf{X}_N(0)$. The vectors $\mathbf{X}_j(0)$ ($j = 1, 2, \ldots, N$) are called
particles or vectors of coordinates of particles (similar to chromosomes in genetic
algorithms). Evaluate the objective function values corresponding to the particles as
$f[\mathbf{X}_1(0)], f[\mathbf{X}_2(0)], \ldots, f[\mathbf{X}_N(0)]$.

3. Find the velocities of particles. All particles will be moving to the optimal point
with a velocity. Initially, all particle velocities are assumed to be zero. Set the
iteration number as $i = 1$.

4. In the $i$th iteration, find the following two important parameters used by a
typical particle $j$:

(a) The historical best value of $\mathbf{X}_j(i)$ (coordinates of the $j$th particle in the
current iteration $i$), $\mathbf{P}_{\text{best},j}$, with the highest value of the objective function,
$f[\mathbf{X}_j(i)]$, encountered by particle $j$ in all the previous iterations.

The historical best value of $\mathbf{X}_j(i)$ (coordinates of all particles up to that
iteration), $\mathbf{G}_{\text{best}}$, with the highest value of the objective function, $f[\mathbf{X}_j(i)]$,
encountered in all the previous iterations by any of the $N$ particles.

(b) Find the velocity of particle $j$ in the $i$th iteration as follows:

$\mathbf{V}_j(i) = \mathbf{V}_j(i-1) + c_1 r_1 [\mathbf{P}_{\text{best},j} - \mathbf{X}_j(i-1)] + c_2 r_2 [\mathbf{G}_{\text{best}} - \mathbf{X}_j(i-1)]; \quad j = 1, 2, \ldots, N$ (13.21)

where $c_1$ and $c_2$ are the cognitive (individual) and social (group) learning
rates, respectively, and $r_1$ and $r_2$ are uniformly distributed random numbers
in the range 0 to 1. The parameters $c_1$ and $c_2$ denote the relative importance
of the memory (position) of the particle itself to the memory (position) of
the swarm. The values of $c_1$ and $c_2$ are usually assumed to be 2 so that
$c_1 r_1$ and $c_2 r_2$ ensure that the particles would overfly the target about half
the time.

(c) Find the position or coordinate of the $j$th particle in the $i$th iteration as

$\mathbf{X}_j(i) = \mathbf{X}_j(i-1) + \mathbf{V}_j(i); \quad j = 1, 2, \ldots, N$ (13.22)

where a time step of unity is assumed in the velocity term in Eq. (13.22).
Evaluate the objective function values corresponding to the particles as
$f[\mathbf{X}_1(i)], f[\mathbf{X}_2(i)], \ldots, f[\mathbf{X}_N(i)]$.

5. Check the convergence of the current solution. If the positions of all particles
converge to the same set of values, the method is assumed to have converged.
If the convergence criterion is not satisfied, step 4 is repeated by updating the
iteration number as $i = i + 1$, and by computing the new values of $\mathbf{P}_{\text{best},j}$ and
$\mathbf{G}_{\text{best}}$. The iterative process is continued until all particles converge to the same
optimum solution.
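The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function name `pso_maximize`, the iteration cap `max_iter`, the tolerance `tol`, the fixed random seed, and the clamping of positions to the bounds are assumptions added here so that the example runs and stays inside the design space.

```python
import random

def pso_maximize(f, lower, upper, n_particles=20, c1=2.0, c2=2.0,
                 max_iter=200, tol=1e-6, seed=0):
    """Maximize f over the box [lower, upper] with the basic PSO of steps 1-5."""
    rng = random.Random(seed)
    dim = len(lower)
    # Step 2: random initial positions inside the bounds.
    x = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
         for _ in range(n_particles)]
    # Step 3: zero initial velocities.
    v = [[0.0] * dim for _ in range(n_particles)]
    # Personal and global bests after the initial evaluation.
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = max(range(n_particles), key=lambda j: pbest_val[j])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(max_iter):
        # Step 4(b)-(c): Eqs. (13.21) and (13.22) for every particle.
        for j in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            for d in range(dim):
                v[j][d] += (c1 * r1 * (pbest[j][d] - x[j][d])
                            + c2 * r2 * (gbest[d] - x[j][d]))
                # Assumption (not in the text): clamp positions to the bounds.
                x[j][d] = min(max(x[j][d] + v[j][d], lower[d]), upper[d])
        # Step 4(a) for the next iteration: refresh P_best,j and G_best.
        for j in range(n_particles):
            val = f(x[j])
            if val > pbest_val[j]:
                pbest[j], pbest_val[j] = x[j][:], val
            if val > gbest_val:
                gbest, gbest_val = x[j][:], val
        # Step 5: stop when all particles have collapsed onto one point.
        spread = max(abs(x[j][d] - x[0][d])
                     for j in range(n_particles) for d in range(dim))
        if spread < tol:
            break
    return gbest, gbest_val


# Usage on a one-variable test function with its maximum at x = 1.
best_x, best_f = pso_maximize(lambda p: -p[0] ** 2 + 2 * p[0] + 11,
                              [-2.0], [2.0])
```

Without an inertia term the velocities can grow quickly, so the loop will often end at the iteration cap rather than through the convergence test of step 5.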


13.4.3 Improvement to the Particle Swarm Optimization Method

It is found that usually the particle velocities build up too fast and the maximum of the
objective function is skipped. Hence an inertia term, $\theta$, is added to reduce the velocity.
Usually, the value of $\theta$ is assumed to vary linearly from 0.9 to 0.4 as the iterative process
progresses. The velocity of the $j$th particle, with the inertia term, is assumed as

$\mathbf{V}_j(i) = \theta \mathbf{V}_j(i-1) + c_1 r_1 [\mathbf{P}_{\text{best},j} - \mathbf{X}_j(i-1)] + c_2 r_2 [\mathbf{G}_{\text{best}} - \mathbf{X}_j(i-1)]; \quad j = 1, 2, \ldots, N$ (13.23)

The inertia weight $\theta$ was originally introduced by Shi and Eberhart in 1999 [13.36] to
dampen the velocities over time (or iterations), enabling the swarm to converge more
accurately and efficiently compared to the original PSO algorithm with Eq. (13.21).
Equation (13.23) denotes an adapting velocity formulation, which improves its fine
tuning ability in solution search. Equation (13.23) shows that a larger value of $\theta$
promotes global exploration and a smaller value promotes a local search. Thus a large value
of $\theta$ makes the algorithm constantly explore new areas without much local search, and
hence it may fail to find the true optimum. To achieve a balance between global and local
exploration to speed up convergence to the true optimum, an inertia weight whose
value decreases linearly with the iteration number has been used:

$\theta(i) = \theta_{\max} - \left( \dfrac{\theta_{\max} - \theta_{\min}}{i_{\max}} \right) i$ (13.24)

where $\theta_{\max}$ and $\theta_{\min}$ are the initial and final values of the inertia weight, respectively,
and $i_{\max}$ is the maximum number of iterations used in PSO. The values of $\theta_{\max} = 0.9$
and $\theta_{\min} = 0.4$ are commonly used.
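As a small illustration of Eqs. (13.23) and (13.24), the following sketch computes the linearly decreasing inertia weight and a single scalar velocity update. The function names `inertia_weight` and `velocity_update` are hypothetical helpers introduced only for this example.

```python
def inertia_weight(i, i_max, theta_max=0.9, theta_min=0.4):
    """Linearly decreasing inertia weight, Eq. (13.24)."""
    return theta_max - (theta_max - theta_min) * i / i_max


def velocity_update(v_prev, x_prev, p_best, g_best, theta, c1, c2, r1, r2):
    """Velocity of one particle (one coordinate) with inertia, Eq. (13.23)."""
    return (theta * v_prev
            + c1 * r1 * (p_best - x_prev)
            + c2 * r2 * (g_best - x_prev))


# The weight starts at theta_max and ends at theta_min after i_max iterations.
schedule = [inertia_weight(i, 100) for i in (0, 50, 100)]
```

Setting `theta = 1` in `velocity_update` recovers the original update of Eq. (13.21).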


13.4.4 Solution of the Constrained Optimization Problem 

Let the constrained optimization problem be given by

Maximize $f(\mathbf{X})$

subject to (13.25)

$g_j(\mathbf{X}) \le 0; \quad j = 1, 2, \ldots, m$

An equivalent unconstrained function, F(X), is constructed by using a penalty function 
for the constraints. Two types of penalty functions can be used in defining the function 
F(X). The first type, known as the stationary penalty function, uses fixed penalty param- 
eters throughout the minimization and the penalty value depends only on the degree of 
violation of the constraints. The second type, known as nonstationary penalty function, 
uses penalty parameters whose values change dynamically with the iteration number 
during optimization. The results obtained with the nonstationary penalty functions have 
been found to be superior to those obtained with stationary penalty functions in the 
numerical studies reported in the literature. As such, the nonstationary penalty function 
is to be used in practical computations. 

According to the nonstationary penalty function approach, the function F(X) is 
defined as 


F(X) = f(X) + C(i)H(X) (13.26) 

where $C(i)$ denotes a dynamically modified penalty parameter that varies with the
iteration number $i$, and $H(\mathbf{X})$ represents the penalty factor associated with the constraints:

$C(i) = (c\,i)^{\alpha}$ (13.27)

$H(\mathbf{X}) = \sum_{j=1}^{m} \left\{ \varphi[q_j(\mathbf{X})] \, [q_j(\mathbf{X})]^{\gamma[q_j(\mathbf{X})]} \right\}$ (13.28)

$\varphi[q_j(\mathbf{X})] = a \left( 1 - \dfrac{1}{e^{q_j(\mathbf{X})}} \right) + b$ (13.29)

$q_j(\mathbf{X}) = \max\{0, g_j(\mathbf{X})\}; \quad j = 1, 2, \ldots, m$ (13.30)

where $c$, $\alpha$, $a$, and $b$ are constants. Note that the function $q_j(\mathbf{X})$ denotes the magnitude
of violation of the $j$th constraint, $\varphi[q_j(\mathbf{X})]$ indicates a continuous assignment function,
assumed to be of exponential form, as shown in Eq. (13.29), and $\gamma[q_j(\mathbf{X})]$ represents
the power of the violated function. The values of $c = 0.5$, $\alpha = 2$, $a = 150$, and $b = 10$
along with

$\gamma[q_j(\mathbf{X})] = \begin{cases} 1 & \text{if } q_j(\mathbf{X}) \le 1 \\ 2 & \text{if } q_j(\mathbf{X}) > 1 \end{cases}$ (13.31)

were used by Liu and Lin [13.35].
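The nonstationary penalty of Eqs. (13.26) to (13.31) can be sketched as follows, using the constants quoted above. The function names are hypothetical. The sketch adds the penalty term exactly as Eq. (13.26) is written; whether the term should be added or subtracted depends on whether F is minimized or maximized, so the sign should be treated as an assumption to check against the problem at hand.

```python
import math

def penalty_parameter(i, c=0.5, alpha=2.0):
    """Dynamically modified penalty parameter C(i), Eq. (13.27)."""
    return (c * i) ** alpha


def penalty_factor(g_values, a=150.0, b=10.0):
    """Penalty factor H(X), Eq. (13.28), from constraint values g_j(X)."""
    total = 0.0
    for g in g_values:
        q = max(0.0, g)                              # violation, Eq. (13.30)
        phi = a * (1.0 - 1.0 / math.exp(q)) + b      # assignment, Eq. (13.29)
        gamma = 1.0 if q <= 1.0 else 2.0             # power, Eq. (13.31)
        total += phi * q ** gamma
    return total


def penalized_objective(f_value, g_values, i):
    """F(X) = f(X) + C(i) H(X), written as in Eq. (13.26)."""
    return f_value + penalty_parameter(i) * penalty_factor(g_values)


# A feasible point (all g_j <= 0) receives no penalty at any iteration.
feasible = penalized_objective(5.0, [-1.0, -0.2], i=3)
# A point violating one constraint by 2 units is penalized more as i grows.
early = penalized_objective(5.0, [2.0], i=1)
late = penalized_objective(5.0, [2.0], i=10)
```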

Example 13.4 Find the maximum of the function

$f(x) = -x^2 + 2x + 11$

in the range $-2 \le x \le 2$ using the PSO method. Use 4 particles ($N = 4$) with the initial
positions $x_1 = -1.5$, $x_2 = 0.0$, $x_3 = 0.5$, and $x_4 = 1.25$. Show the detailed
computations for iterations 1 and 2.

SOLUTION 

1. Choose the number of particles $N$ as 4.

2. The initial population, chosen randomly (given as data), can be represented
as $x_1(0) = -1.5$, $x_2(0) = 0.0$, $x_3(0) = 0.5$, and $x_4(0) = 1.25$. Evaluate the
objective function values at current $x_j(0)$, $j = 1, 2, 3, 4$, as $f_1 = f[x_1(0)] =
f(-1.5) = 5.75$, $f_2 = f[x_2(0)] = f(0.0) = 11.0$, $f_3 = f[x_3(0)] = f(0.5) =
11.75$, and $f_4 = f[x_4(0)] = f(1.25) = 11.9375$.

3. Set the initial velocities of each particle to zero:

$v_1(0) = v_2(0) = v_3(0) = v_4(0) = 0$

Set the iteration number as $i = 1$ and go to step 4.

4. (a) Find $P_{\text{best},1} = -1.5$, $P_{\text{best},2} = 0.0$, $P_{\text{best},3} = 0.5$, $P_{\text{best},4} = 1.25$, and
$G_{\text{best}} = 1.25$.

(b) Find the velocities of the particles as (by assuming $c_1 = c_2 = 1$ and using
the random numbers in the range (0, 1) as $r_1 = 0.3294$ and $r_2 = 0.9542$):

$v_j(i) = v_j(i-1) + r_1 [P_{\text{best},j} - x_j(i-1)] + r_2 [G_{\text{best}} - x_j(i-1)]; \quad j = 1, 2, 3, 4$

so that

$v_1(1) = 0 + 0.3294(-1.5 + 1.5) + 0.9542(1.25 + 1.5) = 2.6241$
$v_2(1) = 0 + 0.3294(0.0 - 0.0) + 0.9542(1.25 - 0.0) = 1.1927$
$v_3(1) = 0 + 0.3294(0.5 - 0.5) + 0.9542(1.25 - 0.5) = 0.7156$
$v_4(1) = 0 + 0.3294(1.25 - 1.25) + 0.9542(1.25 - 1.25) = 0.0$

(c) Find the new values of $x_j(1)$, $j = 1, 2, 3, 4$, as $x_j(i) = x_j(i-1) + v_j(i)$:

$x_1(1) = -1.5 + 2.6241 = 1.1241$
$x_2(1) = 0.0 + 1.1927 = 1.1927$
$x_3(1) = 0.5 + 0.7156 = 1.2156$
$x_4(1) = 1.25 + 0.0 = 1.25$

5. Evaluate the objective function values at the current $x_j(i)$:

$f[x_1(1)] = 11.9846$, $f[x_2(1)] = 11.9629$, $f[x_3(1)] = 11.9535$, $f[x_4(1)] = 11.9375$

Check the convergence of the current solution. Since the values of $x_j(i)$ did
not converge, we increment the iteration number as $i = 2$ and go to step 4.

4. (a) Find $P_{\text{best},1} = 1.1241$, $P_{\text{best},2} = 1.1927$, $P_{\text{best},3} = 1.2156$, $P_{\text{best},4} = 1.25$,
and $G_{\text{best}} = 1.1241$.

(b) Compute the new velocities of particles (by assuming $c_1 = c_2 = 1$ and using
the random numbers in the range (0, 1) as $r_1 = 0.1482$ and $r_2 = 0.4867$):

$v_j(i) = v_j(i-1) + r_1 [P_{\text{best},j} - x_j(i-1)] + r_2 [G_{\text{best}} - x_j(i-1)]; \quad j = 1, 2, 3, 4$

so that

$v_1(2) = 2.6240 + 0.1482(1.1241 - 1.1241) + 0.4867(1.1241 - 1.1241) = 2.6240$
$v_2(2) = 1.1927 + 0.1482(1.1927 - 1.1927) + 0.4867(1.1241 - 1.1927) = 1.1593$
$v_3(2) = 0.7156 + 0.1482(1.2156 - 1.2156) + 0.4867(1.1241 - 1.2156) = 0.6711$
$v_4(2) = 0.0 + 0.1482(1.25 - 1.25) + 0.4867(1.1241 - 1.25) = -0.0613$

(c) Compute the current values of $x_j(i)$ as $x_j(i) = x_j(i-1) + v_j(i)$, $j = 1, 2, 3, 4$:

$x_1(2) = 1.1241 + 2.6240 = 3.7481$
$x_2(2) = 1.1927 + 1.1593 = 2.3520$
$x_3(2) = 1.2156 + 0.6711 = 1.8867$
$x_4(2) = 1.25 - 0.0613 = 1.1887$
5. Find the objective function values at the current $x_j(i)$:

$f[x_1(2)] = 4.4480$, $f[x_2(2)] = 10.1721$, $f[x_3(2)] = 11.2138$, $f[x_4(2)] = 11.9644$

Check the convergence of the process. Since the values of $x_j(i)$ did not
converge, we increment the iteration number as $i = 3$ and go to step 4. Repeat
step 4 until the convergence of the process is achieved.
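The two iterations of Example 13.4 can be checked with a short script. This is a sketch under the assumptions of the example: $c_1 = c_2 = 1$, and the same pair of random numbers is applied to all particles within an iteration. The helper names `pso_step` and `update_bests` are hypothetical.

```python
def f(x):
    """Objective function of Example 13.4."""
    return -x ** 2 + 2 * x + 11


def pso_step(x, v, pbest, gbest, r1, r2, c1=1.0, c2=1.0):
    """One application of Eqs. (13.21) and (13.22) to all particles."""
    v_new = [vj + c1 * r1 * (pj - xj) + c2 * r2 * (gbest - xj)
             for xj, vj, pj in zip(x, v, pbest)]
    x_new = [xj + vj for xj, vj in zip(x, v_new)]
    return x_new, v_new


def update_bests(x, pbest):
    """Refresh each particle's best position and the swarm's best position."""
    pbest = [xj if f(xj) > f(pj) else pj for xj, pj in zip(x, pbest)]
    gbest = max(pbest, key=f)
    return pbest, gbest


x0 = [-1.5, 0.0, 0.5, 1.25]
v0 = [0.0, 0.0, 0.0, 0.0]

pbest, gbest = update_bests(x0, x0)                 # G_best = 1.25
x1, v1 = pso_step(x0, v0, pbest, gbest, 0.3294, 0.9542)

pbest, gbest = update_bests(x1, pbest)              # G_best moves to particle 1
x2, v2 = pso_step(x1, v1, pbest, gbest, 0.1482, 0.4867)

for j, (a, b) in enumerate(zip(x1, x2), start=1):
    print(f"particle {j}: x(1) = {a:.4f}, x(2) = {b:.4f}, f[x(2)] = {f(b):.4f}")
```

The script carries full precision between iterations, so its values can differ from the hand computation in the fourth decimal place.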

13.5 ANT COLONY OPTIMIZATION 
13.5.1 Basic Concept 

Ant colony optimization (ACO) is based on the cooperative behavior of real ant
colonies, which are able to find the shortest path from their nest to a food source.
The method was developed by Dorigo and his associates in the early 1990s [13.31,
13.32]. The ant colony optimization process can be explained by representing the
optimization problem as a multilayered graph as shown in Fig. 13.3, where the number of
layers is equal to the number of design variables and the number of nodes in a
particular layer is equal to the number of discrete values permitted for the corresponding
design variable. Thus each node is associated with a permissible discrete value of a
design variable. Figure 13.3 denotes a problem with six design variables with eight
permissible discrete values for each design variable.

Figure 13.3 Graphical representation of the ACO process in the form of a multilayered
network.

The ACO process can be explained as follows. Let the colony consist of N ants. 
The ants start at the home node, travel through the various layers from the first layer to 
the last or final layer, and end at the destination node in each cycle or iteration. Each 
ant can select only one node in each layer in accordance with the state transition rule 
given by Eq. (13.32). The nodes selected along the path visited by an ant represent 
a candidate solution. For example, a typical path visited by an ant is shown by thick 
lines in Fig. 13.3. This path represents the solution $(x_{12}, x_{23}, x_{31}, x_{45}, x_{56}, x_{64})$.
Once the path is complete, the ant deposits some pheromone on the path based on 
the local updating rule given by Eq. (13.33). When all the ants complete their paths, 
the pheromones on the globally best path are updated using the global updating rule 
described by Eqs. (13.32) and (13.33). 

In the beginning of the optimization process (i.e., in iteration 1), all the edges or 
rays are initialized with an equal amount of pheromone. As such, in iteration 1, all the 
ants start from the home node and end at the destination node by randomly selecting 
a node in each layer. The optimization process is terminated if either the prespecified 
maximum number of iterations is reached or no better solution is found in a prespecified 
number of successive cycles or iterations. The values of the design variables denoted 
by the nodes on the path with largest amount of pheromone are considered as the 
components of the optimum solution vector. In general, at the optimum solution, all 
ants travel along the same best (converged) path. 


13.5.2 Ant Searching Behavior 


An ant $k$, when located at node $i$, uses the pheromone trail $\tau_{ij}$ to compute the probability
of choosing $j$ as the next node:

$p_{ij}^{(k)} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}}{\sum_{j \in N_i^{(k)}} \tau_{ij}^{\alpha}} & \text{if } j \in N_i^{(k)} \\ 0 & \text{if } j \notin N_i^{(k)} \end{cases}$ (13.32)

where $\alpha$ denotes the degree of importance of the pheromones and $N_i^{(k)}$ indicates the
set of neighborhood nodes of ant $k$ when located at node $i$. The neighborhood of node
$i$ contains all the nodes directly connected to node $i$ except the predecessor node (i.e.,
the last node visited before $i$). This will prevent the ant from returning to the same node
visited immediately before node $i$. An ant travels from node to node until it reaches
the destination (food) node.


13.5.3 Path Retracing and Pheromone Updating 

Before returning to the home node (backward node), the $k$th ant deposits $\Delta\tau^{(k)}$ of
pheromone on arcs it has visited. The pheromone value $\tau_{ij}$ on the arc $(i, j)$ traversed
is updated as follows:

$\tau_{ij} \leftarrow \tau_{ij} + \Delta\tau^{(k)}$ (13.33)

Because of the increase in the pheromone, the probability of this arc being selected by
the forthcoming ants will increase.

13.5.4 Pheromone Trail Evaporation 

When an ant $k$ moves to the next node, the pheromone evaporates from all the arcs $ij$
according to the relation

$\tau_{ij} \leftarrow (1 - \rho)\tau_{ij}; \quad \forall (i, j) \in A$ (13.34)

where $\rho \in (0, 1]$ is a parameter and $A$ denotes the segments or arcs traveled by ant $k$
in its path from home to destination. The decrease in pheromone intensity favors the
exploration of different paths during the search process. This favors the elimination of
poor choices made in the path selection. This also helps in bounding the maximum
value attained by the pheromone trails. An iteration is a complete cycle involving the ants'
movement, pheromone evaporation, and pheromone deposit.

After all the ants return to the home node (nest), the pheromone information is
updated according to the relation

$\tau_{ij} = (1 - \rho)\tau_{ij} + \sum_{k=1}^{N} \Delta\tau_{ij}^{(k)}$ (13.35)

where $\rho \in (0, 1]$ is the evaporation rate (also known as the pheromone decay factor)
and $\Delta\tau_{ij}^{(k)}$ is the amount of pheromone deposited on arc $ij$ by the best ant $k$. The
goal of pheromone update is to increase the pheromone value associated with good or
promising paths. The pheromone deposited on arc $ij$ by the best ant is taken as

$\Delta\tau_{ij}^{(k)} = \dfrac{Q}{L_k}$ (13.36)

where $Q$ is a constant and $L_k$ is the length of the path traveled by the $k$th ant (in the case
of the travel from one city to another in a traveling salesman problem). Equation (13.36)
can be implemented as


$\Delta\tau_{ij}^{(k)} = \begin{cases} \dfrac{\zeta f_{\text{best}}}{f_{\text{worst}}} & \text{if } (i, j) \in \text{global best tour} \\ 0 & \text{otherwise} \end{cases}$ (13.37)

where $f_{\text{worst}}$ is the worst value and $f_{\text{best}}$ is the best value of the objective function
among the paths taken by the $N$ ants, and $\zeta$ is a parameter used to control the scale of
the global updating of the pheromone. The larger the value of $\zeta$, the more pheromone
deposited on the global best path, and the better the exploitation ability. The aim of
Eq. (13.37) is to provide a greater amount of pheromone to the tours (solutions) with
better objective function values.
13.5.5 Algorithm

The step-by-step procedure of the ACO algorithm for solving a minimization problem can
be summarized as follows:

Step 1: Assume a suitable number of ants in the colony ($N$). Assume a set of
permissible discrete values for each of the $n$ design variables. Denote the
permissible discrete values of the design variable $x_i$ as $x_{i1}, x_{i2}, \ldots, x_{ip}$
($i = 1, 2, \ldots, n$). Assume equal amounts of pheromone $\tau_{ij}^{(1)}$ initially along all
the arcs or rays (discrete values of design variables) of the multilayered graph
shown in Fig. 13.3. The superscript to $\tau_{ij}$ denotes the iteration number. For
simplicity, $\tau_{ij}^{(1)} = 1$ can be assumed for all arcs $ij$. Set the iteration number
$l = 1$.

Step 2:

(a) Compute the probability ($p_{ij}$) of selecting the arc or ray (or the discrete
value) $x_{ij}$ as

$p_{ij} = \dfrac{\tau_{ij}^{(l)}}{\sum_{m=1}^{p} \tau_{im}^{(l)}}; \quad i = 1, 2, \ldots, n; \quad j = 1, 2, \ldots, p$ (13.38)

which can be seen to be the same as Eq. (13.32) with $\alpha = 1$. A larger value
can also be used for $\alpha$.

(b) The specific path (or discrete values) chosen by the $k$th ant can be
determined using random numbers generated in the range (0, 1). For this, we
find the cumulative probability ranges associated with different paths of
Fig. 13.3 based on the probabilities given by Eq. (13.38). The specific
path chosen by ant $k$ will be determined using the roulette-wheel selection
process in step 3(a).

Step 3:

(a) Generate $N$ random numbers $r_1, r_2, \ldots, r_N$ in the range (0, 1), one for each
ant. Determine the discrete value or path assumed by ant $k$ for variable $i$
as the one for which the cumulative probability range [found in step 2(b)]
includes the value $r_k$.

(b) Repeat step 3(a) for all design variables $i = 1, 2, \ldots, n$.

(c) Evaluate the objective function values corresponding to the complete
paths (design vectors $\mathbf{X}^{(k)}$ or values of $x_{ij}$ chosen for all design variables
$i = 1, 2, \ldots, n$ by ant $k$, $k = 1, 2, \ldots, N$):

$f_k = f(\mathbf{X}^{(k)}); \quad k = 1, 2, \ldots, N$ (13.39)

Determine the best and worst paths among the $N$ paths chosen by different
ants:

$f_{\text{best}} = \min_{k = 1, 2, \ldots, N} \{f_k\}$ (13.40)

$f_{\text{worst}} = \max_{k = 1, 2, \ldots, N} \{f_k\}$ (13.41)
Step 4: Test for the convergence of the process. The process is assumed to have
converged if all $N$ ants take the same best path. If convergence is not achieved,
assume that all the ants return home and start again in search of food. Set the
new iteration number as $l = l + 1$, and update the pheromones on different
arcs (or discrete values of design variables) as

$\tau_{ij}^{(l)} = \tau_{ij}^{(\text{old})} + \sum_{k} \Delta\tau_{ij}^{(k)}$ (13.42)

where $\tau_{ij}^{(\text{old})}$ denotes the pheromone amount of the previous iteration left after
evaporation, which is taken as

$\tau_{ij}^{(\text{old})} = (1 - \rho)\tau_{ij}^{(l-1)}$ (13.43)

and $\Delta\tau_{ij}^{(k)}$ is the pheromone deposited by the best ant $k$ on its path and the
summation extends over all the best ants $k$ (if multiple ants take the same best
path). Note that the best path involves only one arc $ij$ (out of $p$ possible arcs)
for the design variable $i$. The evaporation rate or pheromone decay factor $\rho$ is
assumed to be in the range 0.5 to 0.8 and the pheromone deposited $\Delta\tau_{ij}^{(k)}$ is
computed using Eq. (13.37).

With the new values of $\tau_{ij}^{(l)}$, go to step 2. Steps 2, 3, and 4 are repeated until
the process converges, that is, until all the ants choose the same best path.
In some cases, the iterative process is stopped after completing a prespecified
maximum number of iterations ($l_{\max}$).
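Steps 1 to 4 can be sketched in Python as shown below. This is an illustrative sketch rather than a definitive implementation. The function name `aco_minimize` and the fixed seed are assumptions, and `random.choices` is used as the roulette-wheel mechanism with a fresh random draw for every variable. The deposit of Eq. (13.37) is a ratio of objective values, so the sketch assumes that all objective values are nonzero and of the same sign.

```python
import random

def aco_minimize(f, values, n_ants=4, rho=0.5, zeta=2.0, max_iter=100, seed=0):
    """Minimize f over discrete choices; values[i] lists the options of x_i."""
    rng = random.Random(seed)
    # Step 1: equal pheromone on every arc of the multilayered graph.
    tau = [[1.0] * len(options) for options in values]
    best_x, best_f = None, float("inf")

    for _ in range(max_iter):
        # Steps 2-3: each ant picks one arc per layer by roulette-wheel
        # selection with probabilities proportional to tau, Eq. (13.38).
        paths = [[rng.choices(range(len(row)), weights=row)[0] for row in tau]
                 for _ in range(n_ants)]
        costs = [f([values[i][j] for i, j in enumerate(path)])
                 for path in paths]
        f_best, f_worst = min(costs), max(costs)       # Eqs. (13.40)-(13.41)
        if f_best < best_f:
            k = costs.index(f_best)
            best_x = [values[i][j] for i, j in enumerate(paths[k])]
            best_f = f_best

        # Step 4: converged when all ants follow the same path.
        if all(path == paths[0] for path in paths):
            break

        # Evaporation on all arcs, Eq. (13.43).
        tau = [[(1.0 - rho) * t for t in row] for row in tau]
        # Deposit by every ant on a best path, Eqs. (13.37) and (13.42).
        deposit = zeta * f_best / f_worst
        for path, cost in zip(paths, costs):
            if cost == f_best:
                for i, j in enumerate(path):
                    tau[i][j] += deposit
    return best_x, best_f


def objective(x):
    return x[0] ** 2 - 2 * x[0] - 11


grid = [[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]]
best_x, best_f = aco_minimize(objective, grid)
```

Because the deposit reinforces only the best path of each iteration, a small colony can converge on a good but not optimal value, so several runs with different seeds may be worthwhile.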

Example 13.5 Find the minimum of the function $f(x) = x^2 - 2x - 11$ in the range
(0, 3) using the ant colony optimization method.

SOLUTION

Step 1: Assume the number of ants is $N = 4$. Note that there is only one design
variable in this example ($n = 1$). The permissible discrete values of $x = x_1$
are assumed, within the range of $x_1$, as ($p = 7$):

$x_{11} = 0.0$, $x_{12} = 0.5$, $x_{13} = 1.0$, $x_{14} = 1.5$, $x_{15} = 2.0$, $x_{16} = 2.5$, $x_{17} = 3.0$

Each ant can choose any of the discrete values (paths or arcs) $x_{1j}$,
$j = 1, 2, \ldots, 7$ shown in Fig. 13.4. Assume equal amounts of pheromone
along each of the paths or arcs ($\tau_{1j}$) shown in Fig. 13.4. For simplicity,
$\tau_{1j} = 1$ is assumed for $j = 1, 2, \ldots, 7$.

Set the iteration number as $l = 1$.

Step 2: For any ant $k$, the probability of selecting path (or discrete variable) $x_{1j}$ is
given by

$p_{1j} = \dfrac{\tau_{1j}}{\sum_{p=1}^{7} \tau_{1p}} = \dfrac{1}{7}$

Figure 13.4 Possible paths for an ant (possible discrete values of $x = x_1$).

To select the specific path (or discrete variable) chosen by an ant using a
random number generated in the range (0, 1), cumulative probability ranges are
associated with different paths of Fig. 13.4 as (using roulette-wheel selection
process in step 3):

$x_{11} = (0, \tfrac{1}{7}) = (0.0, 0.1428)$, $x_{12} = (\tfrac{1}{7}, \tfrac{2}{7}) = (0.1428, 0.2857)$,
$x_{13} = (\tfrac{2}{7}, \tfrac{3}{7}) = (0.2857, 0.4286)$,
$x_{14} = (\tfrac{3}{7}, \tfrac{4}{7}) = (0.4286, 0.5714)$, $x_{15} = (\tfrac{4}{7}, \tfrac{5}{7}) = (0.5714, 0.7143)$,
$x_{16} = (\tfrac{5}{7}, \tfrac{6}{7}) = (0.7143, 0.8571)$,
$x_{17} = (\tfrac{6}{7}, 1) = (0.8571, 1.0)$


Step 3: Generate four random numbers $r_i$ ($i = 1, 2, 3, 4$) in the range (0, 1), one for
each ant as $r_1 = 0.3122$, $r_2 = 0.8701$, $r_3 = 0.4729$, and $r_4 = 0.6190$. Using the
cumulative probability range (given in step 2) in which the value of $r_i$ falls,
the discrete value assumed (or the path selected in Fig. 13.4) by different ants
can be seen to be

ant 1: $x_{13} = 1.0$; ant 2: $x_{17} = 3.0$; ant 3: $x_{14} = 1.5$; ant 4: $x_{15} = 2.0$

The objective function values corresponding to the paths chosen by different
ants are given by

ant 1: $f_1 = f(x_{13}) = f(1.0) = -12.0$; ant 2: $f_2 = f(x_{17}) = f(3.0) = -8.0$;
ant 3: $f_3 = f(x_{14}) = f(1.5) = -11.75$; ant 4: $f_4 = f(x_{15}) = f(2.0) = -11.0$

It can be seen that the path taken by ant 1 is the best one (with minimum value
of the objective function): $x_{\text{best}} = x_{13} = 1.0$, $f_{\text{best}} = f_1 = -12.0$; and the path
taken by ant 2 is the worst one (with maximum value of the objective function):
$x_{\text{worst}} = x_{17} = 3.0$, $f_{\text{worst}} = f_2 = -8.0$.
Step 4: Assuming that the ants return home and start again in search of food, we set
the iteration number as $l = 2$. We need to update the pheromone array as

$\tau_{1j}^{(2)} = \tau_{1j}^{(\text{old})} + \sum_{k} \Delta\tau^{(k)}$ (E1)

where $\sum_{k} \Delta\tau^{(k)}$ is the pheromone deposited by the best ant $k$ and the summation
extends over all the best ants $k$ (if multiple ants take the best path). In the
present case, there is only one best ant, $k = 1$, which used the path $x_{13}$. Thus
the value of $\sum_{k} \Delta\tau^{(k)}$ can be determined in this case as

$\sum_{k} \Delta\tau^{(k)} = \Delta\tau^{(k=1)} = \dfrac{\zeta f_{\text{best}}}{f_{\text{worst}}} = \dfrac{(2)(-12.0)}{(-8.0)} = 3.0$

where the scaling parameter $\zeta$ is assumed to be 2. Using a pheromone decay
factor of $\rho = 0.5$ in Eq. (13.43), $\tau_{1j}^{(\text{old})}$ can be computed as

$\tau_{1j}^{(\text{old})} = (1 - 0.5)\tau_{1j}^{(1)} = 0.5(1.0) = 0.5; \quad j = 1, 2, 4, 5, 6, 7$

Thus Eq. (E1) gives

$\tau_{13}^{(2)} = 1.0 + 3.0 = 4.0$ for $j = 3$ and $\tau_{1j}^{(2)} = 0.5$ for $j = 1, 2, 4, 5, 6, 7$

With this, we go to step 2.

Step 2: For any ant $k$, the probability of selecting path $x_{1j}$ in Fig. 13.4 is given by

$p_{1j} = \dfrac{\tau_{1j}}{\sum_{p=1}^{7} \tau_{1p}}; \quad j = 1, 2, \ldots, 7$

where $\tau_{1j} = 0.5$; $j = 1, 2, 4, 5, 6, 7$ and $\tau_{13} = 4$. This gives

$p_{1j} = \dfrac{0.5}{7.0} = 0.0714; \quad j = 1, 2, 4, 5, 6, 7; \quad p_{13} = \dfrac{4.0}{7.0} = 0.5714$

To determine the discrete value or path selected by an ant using a random
number selected in the range (0, 1), cumulative probabilities are associated with
different paths as (roulette-wheel selection process):

$x_{11} = (0, 0.0714)$, $x_{12} = (0.0714, 0.1429)$, $x_{13} = (0.1429, 0.7143)$,
$x_{14} = (0.7143, 0.7857)$, $x_{15} = (0.7857, 0.8571)$, $x_{16} = (0.8571, 0.9286)$,
$x_{17} = (0.9286, 1.0)$


Step 3: Generate four random numbers in the range (0, 1), one for each of the ants
as $r_1 = 0.3688$, $r_2 = 0.8577$, $r_3 = 0.0776$, $r_4 = 0.5791$. Using the cumulative
probability range (given in step 2) in which the value of $r_i$ falls, the discrete
value assumed (or the path selected in Fig. 13.4) by different ants can be seen
to be

ant 1: $x_{13} = 1.0$; ant 2: $x_{16} = 2.5$; ant 3: $x_{12} = 0.5$; ant 4: $x_{13} = 1.0$

This shows that two ants (probabilistically) selected the path $x_{13}$ due to higher
pheromone left on the best path ($x_{13}$) found in the previous iteration. The
objective function values corresponding to the paths chosen by different ants
are given by

ant 1: $f_1 = f(x_{13}) = f(1.0) = -12.0$; ant 2: $f_2 = f(x_{16}) = f(2.5) = -9.75$;
ant 3: $f_3 = f(x_{12}) = f(0.5) = -11.75$; ant 4: $f_4 = f(x_{13}) = f(1.0) = -12.0$

It can be seen that the path taken by ants 1 and 4 is the best one with

$x_{\text{best}} = x_{13} = 1.0$ and $f_{\text{best}} = f_1 = f_4 = -12.0$

and the path taken by ant 2 is the worst one with

$x_{\text{worst}} = x_{16} = 2.5$ and $f_{\text{worst}} = f_2 = -9.75$

Now we go to step 4 to update the pheromone values on the various paths.
Step 4: Assuming that the ants return home and start again in search of food, we set
the iteration number as $l = 3$. We need to update the pheromone array as

$\tau_{1j}^{(3)} = \tau_{1j}^{(\text{old})} + \sum_{k} \Delta\tau^{(k)}$ (E2)

where $\sum_{k} \Delta\tau^{(k)}$ is the pheromone deposited by the best ant $k$ and the summation
extends over all the best ants $k$ (if multiple ants take the best path). In the
present case, there are two best ants, $k = 1$ and 4, which used the path $x_{13}$.
Thus the value of $\sum_{k} \Delta\tau^{(k)}$ can be determined in this case as

$\sum_{k} \Delta\tau^{(k)} = \Delta\tau^{(k=1)} + \Delta\tau^{(k=4)} = \dfrac{2 \zeta f_{\text{best}}}{f_{\text{worst}}} = \dfrac{(2)(2)(-12.0)}{(-9.75)} = 4.9231$

where the scaling parameter $\zeta$ is assumed to be 2. Using a pheromone decay
factor of $\rho = 0.5$ in Eq. (13.43), $\tau_{1j}^{(\text{old})}$ can be computed as

$\tau_{1j}^{(\text{old})} = (1.0 - 0.5)\tau_{1j}^{(2)} = 0.5(0.5) = 0.25; \quad j = 1, 2, 4, 5, 6, 7$

Thus Eq. (E2) gives

$\tau_{13}^{(3)} = 4.0 + 4.9231 = 8.9231$ for $j = 3$ and
$\tau_{1j}^{(3)} = 0.25$ for $j = 1, 2, 4, 5, 6, 7$

With this, we go to step 2.
Step 2: For any ant $k$, the probability of selecting path $x_{1j}$ in Fig. 13.4 is given by

$p_{1j} = \dfrac{\tau_{1j}}{\sum_{p=1}^{7} \tau_{1p}}; \quad j = 1, 2, \ldots, 7$

where $\tau_{1j} = 0.25$; $j = 1, 2, 4, 5, 6, 7$ and $\tau_{13} = 8.9231$. This gives

$p_{1j} = \dfrac{0.25}{10.4231} = 0.0240; \quad j = 1, 2, 4, 5, 6, 7; \quad p_{13} = \dfrac{8.9231}{10.4231} = 0.8561$

To determine the discrete value or path selected by an ant using a random
number selected in the range (0, 1), cumulative probabilities are associated with
different paths as (roulette-wheel selection process):

$x_{11} = (0, 0.0240)$, $x_{12} = (0.0240, 0.0480)$, $x_{13} = (0.0480, 0.9040)$,
$x_{14} = (0.9040, 0.9280)$, $x_{15} = (0.9280, 0.9520)$,
$x_{16} = (0.9520, 0.9760)$, $x_{17} = (0.9760, 1.0)$

With this information, we go to step 3 and then to step 4. Steps 2, 3, and 4
are repeated until the process converges (until all the ants choose the same
best path).
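The pheromone arithmetic of Example 13.5 can be reproduced with the following sketch, which uses the random numbers quoted in the example. The helper names are hypothetical. Note one assumption made to match the worked numbers: as in the example's own computation ($1.0 + 3.0 = 4.0$ and $4.0 + 4.9231 = 8.9231$), the deposit is added to the previous, unevaporated pheromone on the best arc, while all other arcs are evaporated with $\rho = 0.5$.

```python
from itertools import accumulate

VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]   # x_11 ... x_17


def f(x):
    """Objective function of Example 13.5."""
    return x ** 2 - 2 * x - 11


def probabilities(tau):
    """Selection probabilities for one variable, Eq. (13.38)."""
    total = sum(tau)
    return [t / total for t in tau]


def roulette(probs, r):
    """Index of the cumulative probability range that contains r."""
    for j, upper_edge in enumerate(accumulate(probs)):
        if r < upper_edge:
            return j
    return len(probs) - 1


def update(tau, chosen, rho=0.5, zeta=2.0):
    """Pheromone update following the arithmetic of the worked example."""
    costs = [f(VALUES[j]) for j in chosen]
    f_best, f_worst = min(costs), max(costs)
    j_best = chosen[costs.index(f_best)]
    n_best = costs.count(f_best)                    # ants sharing the best path
    deposit = n_best * zeta * f_best / f_worst      # Eq. (13.37), summed
    return [t + deposit if j == j_best else (1.0 - rho) * t
            for j, t in enumerate(tau)]


tau1 = [1.0] * 7
ants1 = [roulette(probabilities(tau1), r)
         for r in (0.3122, 0.8701, 0.4729, 0.6190)]
tau2 = update(tau1, ants1)

ants2 = [roulette(probabilities(tau2), r)
         for r in (0.3688, 0.8577, 0.0776, 0.5791)]
tau3 = update(tau2, ants2)
p3 = probabilities(tau3)

print([round(t, 4) for t in tau3])
print([round(p, 4) for p in p3])
```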


13.6 OPTIMIZATION OF FUZZY SYSTEMS 

In traditional designs, the optimization problem is stated in precise mathematical terms.
However, in many real-world problems, the design data, objective function, and
constraints are stated in vague and linguistic terms. For example, the statement, “This beam
carries a load of 1000 lb with a probability of 0.8” is imprecise because of
randomness in the material properties of the beam. On the other hand, the statement, “This
beam carries a large load” is imprecise because of the fuzzy meaning of “large load.”
Similarly, in the optimum design of a machine component, the induced stress ($\sigma$) is
constrained by an upper bound value ($\sigma_{\max}$) as $\sigma \le \sigma_{\max}$. If $\sigma_{\max} = 30{,}000$ psi, it implies
that a design with $\sigma = 30{,}000$ psi is acceptable whereas a design with $\sigma = 30{,}001$
psi is not acceptable. However, there is no substantive difference between designs with
$\sigma = 30{,}000$ psi and $\sigma = 30{,}001$ psi. It appears that it is more reasonable to have a
transition stage from absolute permission to absolute impermission. This implies that
the constraint is to be stated in fuzzy terms. Fuzzy theories can be used to model and
design systems involving vague and imprecise information [13.22, 13.26, 13.27].

13.6.1 Fuzzy Set Theory

Let $X$ be a classical crisp set of objects, called the universe, whose generic elements are
denoted by $x$. Membership in a classical subset $A$ of $X$ can be viewed as a characteristic
function $\mu_A$ from $X$ to $\{0, 1\}$ such that

$\mu_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$ (13.44)

The set $\{0, 1\}$ is called a valuation set. A set $A$ is called a fuzzy set if the valuation set
is allowed to be the whole interval [0, 1]. The fuzzy set $A$ is characterized by the set
of all pairs of points denoted as

$A = \{x, \mu_A(x)\}, \quad x \in X$ (13.45)

where $\mu_A(x)$ is called the membership function of $x$ in $A$. The closer the value of
$\mu_A(x)$ is to 1, the more $x$ belongs to $A$. For example, let $X$ = {62 64 66 68 70
72 74 76 78 80} be possible temperature settings of the thermostat (°F) in an
air-conditioned building. Then the fuzzy set $A$ of “comfortable temperatures for human
activity” may be defined as

A = {(62, 0.2) (64,0.5) (66,0.8) (68,0.95) (70,0.85) (72,0.75) 

(74,0.6) (76,0.4) (78,0.2) (80,1.0)} (13.46) 

where a grade of membership of 1 implies complete comfort and 0 implies complete
discomfort. In general, if $X$ is a finite set, $\{x_1, x_2, \ldots, x_n\}$, the fuzzy set on $X$ can be
expressed as

$A = \mu_A(x_1)|_{x_1} + \mu_A(x_2)|_{x_2} + \cdots + \mu_A(x_n)|_{x_n} = \sum_{i=1}^{n} \mu_A(x_i)|_{x_i}$ (13.47)

or in the limit, we can express $A$ as

$A = \int_X \mu_A(x)|_x$ (13.48)

Crisp set theory is concerned with membership of precisely defined sets and is
suitable for describing objective matters with countable events. Crisp set theory is
developed using binary statements and is illustrated in Fig. 13.5a, which shows the
support for $y_1$ with no ambiguity. Since fuzzy set theory is concerned with linguistic
statements of support for membership in imprecise sets, a discrete fuzzy set is denoted
as in Fig. 13.5b, where the degree of support is shown by the membership values, $\mu_1$,
$\mu_2, \ldots, \mu_n$, corresponding to $y_1, y_2, \ldots, y_n$, respectively. The discrete fuzzy set can
be generalized to a continuous form as shown in Fig. 13.5c.

The basic crisp set operations of union, intersection, and complement can be rep- 
resented on Venn diagrams as shown in Fig. 13.6. Similar operations can be defined 
for fuzzy sets, noting that the sets A and B do not have clear boundaries in this case. 
The graphs of $\mu_A$ and $\mu_B$ can be used to define the set-theoretic operations of fuzzy
sets. The union of the fuzzy sets $A$ and $B$ is defined as

$\mu_{A \cup B}(y) = \mu_A(y) \vee \mu_B(y) = \max[\mu_A(y), \mu_B(y)] = \begin{cases} \mu_A(y) & \text{if } \mu_A > \mu_B \\ \mu_B(y) & \text{if } \mu_A < \mu_B \end{cases}$ (13.49)
Figure 13.5 Crisp and fuzzy sets: (a) crisp set; (b) discrete fuzzy set; (c) continuous fuzzy
set. [13.22], with permission of ASME.

Figure 13.6 Basic set operations in crisp set theory: (a) A or B or both: $A \cup B$; (b) A and
B: $A \cap B$; (c) not A: $\bar{A}$. [13.22], with permission of ASME.


The result of this operation is shown in Fig. 13.7a. The intersection of the fuzzy sets
$A$ and $B$ is defined as

$\mu_{A \cap B}(y) = \mu_A(y) \wedge \mu_B(y) = \min[\mu_A(y), \mu_B(y)] = \begin{cases} \mu_A(y) & \text{if } \mu_A < \mu_B \\ \mu_B(y) & \text{if } \mu_A > \mu_B \end{cases}$ (13.50)

This operation is shown in Fig. 13.7b. The complement of a fuzzy set $A$ is shown as $\bar{A}$
in Fig. 13.7c, in which for every $\mu_A(y)$, there is a corresponding $\mu_{\bar{A}}(y) = 1 - \mu_A(y)$,
which defines the complement of the set $A$, $\bar{A}$.
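For discrete fuzzy sets, the union, intersection, and complement of Eqs. (13.49) and (13.50) reduce to element-wise maximum, minimum, and subtraction from 1. The sketch below stores a fuzzy set as a Python dictionary that maps each element to its membership value. The set `B` and the function names are hypothetical and serve only as an illustration.

```python
def fuzzy_union(a, b):
    """Membership of A union B: element-wise maximum, Eq. (13.49)."""
    return {y: max(a.get(y, 0.0), b.get(y, 0.0)) for y in sorted(set(a) | set(b))}


def fuzzy_intersection(a, b):
    """Membership of A intersection B: element-wise minimum, Eq. (13.50)."""
    return {y: min(a.get(y, 0.0), b.get(y, 0.0)) for y in sorted(set(a) | set(b))}


def fuzzy_complement(a):
    """Membership of the complement of A: 1 - mu_A(y)."""
    return {y: 1.0 - mu for y, mu in a.items()}


# A: first three pairs of the "comfortable temperature" set of Eq. (13.46).
A = {62: 0.2, 64: 0.5, 66: 0.8}
# B: a hypothetical second fuzzy set on the same temperature settings.
B = {62: 0.6, 64: 0.3, 66: 0.8}

union_AB = fuzzy_union(A, B)
intersection_AB = fuzzy_intersection(A, B)
not_A = fuzzy_complement(A)
```

An element missing from one dictionary is treated as having zero membership in that set.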
Figure 13.7 Basic set operations in fuzzy set theory: (a) union; (b) intersection; (c)
complement. [13.22], with permission of ASME.


13.6.2 Optimization of Fuzzy Systems 

Conventional optimization methods deal with the selection of design variables that
optimize an objective function subject to the satisfaction of the stated constraints.
For a fuzzy system, this notion of optimization has to be revised. Since the objective 
and constraint functions are characterized by the membership functions in a fuzzy 
system, a design (decision) can be viewed as the intersection of the fuzzy objective 
and constraint functions. For illustration, consider the objective function: “The depth of
the crane girder (x) should be substantially greater than 80 in.” This can be represented
by a membership function, such as

$$\mu_f(x) = \begin{cases} 0 & \text{if } x \le 80 \text{ in.} \\ [1 + (x - 80)^{-2}]^{-1} & \text{if } x > 80 \text{ in.} \end{cases} \tag{13.51}$$


Let the constraint be “The depth of the crane girder (x) should be in the vicinity of 
83 in.” This can be described by a membership function of the type 

$$\mu_g(x) = [1 + (x - 83)^4]^{-1} \tag{13.52}$$


Then the design (decision) is described by the membership function, $\mu_D(x)$, as

$$\mu_D(x) = \mu_f(x) \wedge \mu_g(x) = \begin{cases} 0 & \text{if } x \le 80 \text{ in.} \\ \min\{[1 + (x - 80)^{-2}]^{-1}, \; [1 + (x - 83)^4]^{-1}\} & \text{if } x > 80 \text{ in.} \end{cases}$$

This relationship is shown in Fig. 13.8. 
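The fuzzy decision for the crane girder can be evaluated numerically. The sketch below is an illustration only: it assumes the two membership functions of Eqs. (13.51) and (13.52), takes the objective membership as zero at x = 80 in., and locates the depth with the largest decision membership by a simple grid search over an arbitrarily chosen range of 80 to 90 in. The function names and the grid are assumptions, not part of the text.

```python
def mu_f(x):
    """Objective membership, Eq. (13.51): depth substantially greater than 80 in."""
    if x <= 80.0:
        return 0.0
    return 1.0 / (1.0 + (x - 80.0) ** -2)

def mu_g(x):
    """Constraint membership, Eq. (13.52): depth in the vicinity of 83 in."""
    return 1.0 / (1.0 + (x - 83.0) ** 4)

def mu_d(x):
    """Decision membership: intersection (minimum) of objective and constraint."""
    return min(mu_f(x), mu_g(x))

# Grid search over 80 in. to 90 in. in steps of 0.001 in. (assumed range).
xs = [80.0 + 0.001 * k for k in range(10001)]
x_best = max(xs, key=mu_d)
mu_best = mu_d(x_best)
```

Because the decision is the minimum of one increasing and one peaked function, the best depth lies slightly above 83 in., where the two membership curves cross.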

The conventional optimization problem is usually stated as follows: 

$$\text{Find } \mathbf{X} \text{ which minimizes } f(\mathbf{X}) \tag{13.53}$$

subject to

$$g_j^{(l)} \le g_j(\mathbf{X}) \le g_j^{(u)}, \quad j = 1, 2, \ldots, m \tag{13.54}$$

where the superscripts l and u denote the lower and upper bound values, respectively.
The optimization problem of a fuzzy system is stated as follows: 


Find X which minimizes $f(\mathbf{X})$

subject to

$$g_j(\mathbf{X}) \in G_j, \quad j = 1, 2, \ldots, m \tag{13.55}$$

where $G_j$ denotes the fuzzy interval to which the function $g_j(\mathbf{X})$ should belong. Thus
the fuzzy feasible region, S, which denotes the intersection of all $G_j$, is defined by the
membership function

$$\mu_S(\mathbf{X}) = \min_{j = 1, 2, \ldots, m} \{\mu_{G_j}[g_j(\mathbf{X})]\} \tag{13.56}$$

Figure 13.8 Concept of fuzzy decision. [13.22], with permission of ASME.

Since a design vector X is considered feasible when $\mu_S(\mathbf{X}) > 0$, the optimum design is
characterized by the maximum value of the intersection of the objective function and
the feasible domain:

$$\mu_D(\mathbf{X}^*) = \max \mu_D(\mathbf{X}), \quad \mathbf{X} \in D \tag{13.57}$$

where

$$\mu_D(\mathbf{X}) = \min\left\{\mu_f(\mathbf{X}), \; \min_{j = 1, 2, \ldots, m} \mu_{G_j}[g_j(\mathbf{X})]\right\} \tag{13.58}$$


13.6.3 Computational Procedure 

The solution of a fuzzy optimization problem can be determined once the membership
functions of $f$ and $g_j$ are known. In practical situations, the membership functions
are constructed with the cooperation and assistance of experienced
engineers in specific cases. In the absence of other information, linear membership
functions are commonly used, based on the expected variations of the objective and
constraint functions. Once the membership functions are known, the problem can be
posed as a crisp optimization problem as

Find X and $\lambda$ which maximize $\lambda$

subject to

$$\begin{aligned} \lambda &\le \mu_f(\mathbf{X}) \\ \lambda &\le \mu_{g_j^{(l)}}(\mathbf{X}), \quad j = 1, 2, \ldots, m \\ \lambda &\le \mu_{g_j^{(u)}}(\mathbf{X}), \quad j = 1, 2, \ldots, m \end{aligned} \tag{13.59}$$


13.6.4 Numerical Results 

The minimization of the error between the generated and specified outputs of the 
four-bar mechanism shown in Fig. 13.9 is considered. The design vector is taken as 
X = {a b c L! p } T . The mechanism is constrained to be a crank-rocker mecha- 
nism so that 

$$a - b \le 0, \quad a - c \le 0, \quad a \le 1$$
$$d = [(a + c) - (b + 1)][(c - a)^2 - (b - 1)^2] \le 0$$

The maximum deviation of the transmission angle ($\mu$) from 90° is restricted to be less
than a specified value, $t_{\max}$ = 35°. The specified output angle is

f 20° + t, 0 ° <<(> < 240° 

e s (<P) = “ “ 

[ unspecified, 240 < 4> < 360 

Linear membership functions are assumed for the response characteristics [13.22]. The
optimum solution is found to be X = {0.2537 0.8901 0.8865 −0.7858 −1.0}^T
with $f^*$ = 1.6562 and $\lambda^*$ = 0.4681. This indicates that the maximum level of satisfaction
that can be achieved in the presence of fuzziness in the problem is 0.4681. The
transmission angle constraint is found to be active at the optimum solution [13.22].


13.7 NEURAL-NETWORK-BASED OPTIMIZATION 

The immense computational power of the nervous system to solve perceptional problems
in the presence of massive amounts of sensory data has been associated with its parallel
processing capability. Neural computing strategies have been adopted to solve
optimization problems in recent years [13.23, 13.24]. A neural network is a massively
parallel network of interconnected simple processors (neurons) in which each neuron
accepts a set of inputs from other neurons and computes an output that is propagated
to the output nodes. Thus a neural network can be described in terms of the individual
neurons, the network connectivity, the weights associated with the interconnections
between neurons, and the activation function of each neuron. The network maps an
input vector from one space to another. The mapping is not specified but is learned.

Figure 13.9 Four-bar function generating mechanism.

Consider a single neuron as shown in Fig. 13.10. The neuron receives a set of
n inputs, $x_i$, i = 1, 2, ..., n, from its neighboring neurons and a bias whose value
is equal to 1. Each input has a weight (gain) $w_i$ associated with it. The weighted
sum of the inputs determines the state or activity of a neuron, and is given by
$a = \sum_{i=1}^{n+1} w_i x_i = \mathbf{W}^T \mathbf{X}$, where $\mathbf{X} = \{x_1 \; x_2 \; \cdots \; x_n \; 1\}^T$. A simple function is now used to
provide a mapping from the n-dimensional space of inputs into a one-dimensional
space of the output, which the neuron sends to its neighbors. The output of a neuron is
a function of its state and can be denoted as $f(a)$. Usually, no output will be produced
unless the activation level of the node exceeds a threshold value. The output of a neuron
is commonly described by a sigmoid function as

$$f(a) = \frac{1}{1 + e^{-a}} \tag{13.60}$$

which is shown graphically in Fig. 13.10. The sigmoid function can handle large as
well as small input signals. The slope of the function $f(a)$ represents the available
gain. Since the output of the neuron depends only on its inputs and the threshold value,
each neuron can be considered as a separate processor operating in parallel with other
neurons. The learning process consists of determining values for the weights $w_i$ that
lead to an optimal association of the inputs and outputs of the neural network.
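The single-neuron model can be sketched in a few lines. The code below is only an illustration of the weighted sum and the sigmoid output of Eq. (13.60); the input values, the weights, and the function names are hypothetical.

```python
import math

def sigmoid(a):
    """Eq. (13.60): maps the neuron state a to an output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(inputs, weights):
    """weights has n + 1 entries; the last one multiplies the bias input of 1."""
    x = list(inputs) + [1.0]                      # append the bias input
    a = sum(w * xi for w, xi in zip(weights, x))  # state a = W^T X
    return sigmoid(a)

# Hypothetical neuron with n = 3 inputs; here the state is a = -0.4.
out = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, -0.2, 0.1])
```

Since every neuron evaluates only its own weighted sum, many such calls can run independently, which is the parallelism the text refers to.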




Figure 13.10 Single neuron and its output. [13.23], reprinted with permission of Gordon &
Breach Science Publishers.

Several neural network architectures, such as the Hopfield and Kohonen networks, 
have been proposed to reflect the basic characteristics of a single neuron. These archi- 
tectures differ one from the other in terms of the number of neurons in the network, 
the nature of the threshold functions, the connectivities of the various neurons, and 
the learning procedures. A typical architecture, known as the multilayer feedforward 
network, is shown in Fig. 13.11. In this figure the arcs represent the unidirectional 
feedforward communication links between the neurons. A weight or gain associated 
with each of these connections controls the output passing through a connection. The 
weight can be positive or negative, depending on the excitatory or inhibitory nature 
of the particular neuron. The strengths of the various interconnections (weights) act as 
repositories for knowledge representation contained in the network. 

The network is trained by minimizing the mean-squared error between the actual 
output of the output layer and the target output for all the input patterns. The error is 
minimized by adjusting the weights associated with various interconnections. A number 
of learning schemes, including a variation of the steepest descent method, have been 
used in the literature. These schemes govern how the weights are to be varied to 
minimize the error at the output nodes. For illustration, consider the network shown 
in Fig. 13.12. This network is to be trained to map the angular displacement and 
angular velocity relationships, transmission angle, and the mechanical advantage of a 
four-bar function-generating mechanism (Fig. 13.9). The inputs to the five neurons in
the input layer include the three link lengths of the mechanism ($r_2$, $r_3$, and $r_4$) and the
angular displacement and velocity of the input link ($\theta_2$ and $\omega_2$). The outputs of the six
neurons in the output layer include the angular positions and velocities of the coupler
and the output links ($\theta_3$, $\omega_3$, $\theta_4$, and $\omega_4$), the transmission angle ($\gamma$), and the mechanical
advantage ($\eta$) of the mechanism. The network is trained by inputting several possible
combinations of the values of $r_2$, $r_3$, $r_4$, $\theta_2$, and $\omega_2$ and supplying the corresponding
values of $\theta_3$, $\theta_4$, $\omega_3$, $\omega_4$, $\gamma$, and $\eta$. The difference between the values predicted by the
network and the actual output is used to adjust the various interconnection weights
such that the mean-squared error at the output nodes is minimized. Once trained, the
network provides a rapid and efficient scheme that maps the input into the desired
output of the four-bar mechanism. It is to be noted that the explicit equations relating
$r_2$, $r_3$, $r_4$, $\theta_2$, and $\omega_2$ and the output quantities $\theta_3$, $\theta_4$, $\omega_3$, $\omega_4$, $\gamma$, and $\eta$ have not been
programmed into the network; rather, the network learns these relationships during the
training process by adjusting the weights associated with the various interconnections.
The same approach can be used for other mechanical and structural analyses that might
require finite-element-based computations.

Figure 13.11 Multilayer feedforward network. [13.23], reprinted with permission of Gordon
and Breach Science Publishers.

Figure 13.12 Network used to train relationships for a four-bar mechanism. [13.23], reprinted
with permission of Gordon & Breach Science Publishers.
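The idea of training by adjusting weights to reduce the mean-squared output error can be illustrated on a much smaller case than the four-bar network. The sketch below is a simplification and not the network of Fig. 13.12: it assumes a single sigmoid neuron with two inputs and a bias, trained by plain steepest descent to reproduce the logical OR function. The training data, learning rate, epoch count, and function names are all hypothetical choices.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# Toy input/target pairs (logical OR); the input-output rule itself is never coded.
patterns = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

def predict(w, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])  # w[2] is the bias weight

def mean_squared_error(w):
    return sum((predict(w, x) - t) ** 2 for x, t in patterns) / len(patterns)

def train(epochs=5000, rate=0.5):
    """Steepest descent on the mean-squared error, starting from zero weights."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, t in patterns:
            y = predict(w, x)
            # d(error)/d(state) for one pattern; y (1 - y) is the sigmoid slope
            delta = 2.0 * (y - t) * y * (1.0 - y) / len(patterns)
            grad[0] += delta * x[0]
            grad[1] += delta * x[1]
            grad[2] += delta
        w = [wi - rate * gi for wi, gi in zip(w, grad)]
    return w

w = train()
```

A multilayer network applies the same error-driven update to every layer, with the gradients of the hidden-layer weights obtained through the chain rule.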

Numerical Results. The minimization of the structural weight of the three-bar
truss described in Section 7.22.1 (Fig. 7.21) was considered with constraints on
the cross-sectional areas and stresses in the members. Two load conditions were
considered with P = 20,000 lb, E = 10 × 10^6 psi, $\rho$ = 0.1 lb/in^3, H = 100 in., $\sigma_{\min}$ =
−15,000 psi, $\sigma_{\max}$ = 20,000 psi, $A_i^{(l)}$ = 0.1 in^2 (i = 1, 2), and $A_i^{(u)}$ = 5.0 in^2 (i = 1, 2).
The solution obtained using neural-network-based optimization is [13.23]:
$x_1^*$ = 0.788 in^2, $x_2^*$ = 0.4079 in^2, and $f^*$ = 26.3716 lb. This can be compared
with the solution given by nonlinear programming: $x_1^*$ = 0.7745 in^2, $x_2^*$ = 0.4499 in^2,
and $f^*$ = 26.4051 lb.
REVIEW QUESTIONS 

13.1 Define the following terms: 

(a) Fuzzy parameter 

(b) Annealing 
(c) Roulette wheel selection process 

(d) Pheromone evaporation rate 

(e) Neural network 

(f) Fuzzy feasible domain 

(g) Membership function 

(h) Multilayer feedforward network 

13.2 Match the following terms:

(a) Fuzzy optimization              Based on shortest path
(b) Genetic algorithms              Analysis equations not programmed
(c) Neural network method           Linguistic data can be used
(d) Simulated annealing             Based on the behavior of a flock of birds
(e) Particle swarm optimization     Based on principle of survival of the fittest
(f) Ant colony optimization         Based on cooling of heated solids

13.3 Answer true or false:

(a) GAs can be used to solve problems with continuous design variables.
(b) GAs do not require derivatives of the objective function.
(c) Crossover involves swapping of the binary digits between two strings.
(d) Mutation operator is used to produce offspring.
(e) No new strings are formed in the reproduction stage in GAs.
(f) Simulated annealing can be used to solve only discrete optimization problems.
(g) Particle swarm optimization is based on cognitive and social learning rates of groups
of birds.
(h) Particle swarm optimization method uses the positions and velocities of particles.
(i) Genetic algorithms basically maximize an unconstrained function.
(j) Simulated annealing basically solves an unconstrained optimization problem.
(k) GAs seek to find a better design point from a trial design point.
(l) GAs can solve a discrete optimization problem with no additional effort.
(m) SA is a type of random search technique.
(n) GAs and SA can find the global minimum with high probability.
(o) GAs are zeroth-order methods.
(p) Discrete variables need not be represented as binary strings in GAs.
(q) SA will find a local minimum if the feasible space is nonconvex.
(r) The expressions relating the input and output are to be programmed in
neural-network-based methods.
(s) Several network architectures can be used in neural-network-based optimization.
(t) A fuzzy quantity is the same as a random quantity.
(u) Ant colony optimization solves only discrete optimization problems.
(v) Fuzzy optimization involves the maximization of the intersection of the objective
function and feasible domain.


13.4 Give brief answers: 

(a) What is Boltzmann's probability distribution? 
(b) How is an inequality constrained optimization problem converted into an unconstrained
problem for use in GAs?

(c) What is the difference between a crisp set and a fuzzy set? 

(d) How is the output of a neuron described commonly? 

(e) What are the basic operations used in GAs? 

(f) What is a fitness function in GAs? 

(g) Can you consider SA as a zeroth-order search method? 

(h) How do you select the length of the binary string to represent a design variable? 

(i) Construct the objective function to be used in GAs for a minimization problem with 
mixed equality and inequality constraints. 

(j ) How is the crossover operation performed in GAs? 

(k) What is the purpose of mutation? How is it implemented in GAs? 

(l) What is the physical basis of SA? 

(m) What is the Metropolis criterion and where is it used?

(n) What is a neural network? 

(o) How is a neuron modeled in neural-network-based models? 

(p) What is a sigmoid function? 

(q) How is the error in the output minimized during network training? 

(r) What is the difference between a random quantity and a fuzzy quantity? 

(s) Give two examples of design parameters that can be considered as fuzzy. 

(t) What is a valuation set? 

(u) What is the significance of membership function? 

(v) Define the union of two fuzzy sets A and B.

(w) How is the intersection of two fuzzy sets A and B defined? 

(x) Show the complement of a fuzzy set in a Venn diagram. 

(y) How is the optimum solution defined in a fuzzy environment? 

(z) How is the fuzzy feasible domain defined for a problem with inequality constraints? 


PROBLEMS 

13.1 Consider the following two strings denoting the vectors $\mathbf{X}_1$ and $\mathbf{X}_2$:

$\mathbf{X}_1$ : {1 0 0 0 1 0 1 1 0 1}

$\mathbf{X}_2$ : {0 1 1 1 1 1 0 1 1 0}

Find the result of crossover at location 2. Also, determine the decimal values of the 
variables before and after crossover if each string denotes a vector of two variables. 

13.2 Two discrete fuzzy sets, A and B are defined as follows: 

A = {(60, 0.1) (62, 0.5) (64, 0.7) (66, 0.9) (68, 1.0) (70, 0.8)}

B = {(60, 0.0) (62, 0.2) (64, 0.4) (66, 0.8) (68, 0.9) (70, 1.0)}

Determine the union and intersection of these sets. 


13.3 Determine the size of the binary string to be used to achieve an accuracy of 0.01 for a 
design variable with the following bounds: 


(a) $x^{(l)}$ = 0, $x^{(u)}$ = 5

(b) $x^{(l)}$ = 0, $x^{(u)}$ = 10

(c) $x^{(l)}$ = 0, $x^{(u)}$ = 20
13.4 A design variable, with lower and upper bounds 2 and 13, respectively, is to be represented
with an accuracy of 0.02. Determine the size of the binary string to be used.

13.5 Find the minimum of $f = x^5 - 5x^3 - 20x + 5$ in the range (0, 3) using the ant colony
optimization method. Show detailed calculations for 2 iterations with 4 ants.

13.6 In the ACO method, the amounts of pheromone along the various arcs from node i
are given by $\tau_{ij}$ = 1, 2, 4, 3, 5, 2 for j = 1, 2, 3, 4, 5, 6, respectively. Find the arc (ij)
chosen by an ant using the roulette-wheel selection process with the random
number r = 0.4921.

13.7 Solve Example 13.5 by neglecting pheromone evaporation. Show the calculations for 2
iterations.

13.8 Find the maximum of the function $f = -x^5 + 5x^3 + 20x - 5$ in the range $-4 \le x \le 4$
using the PSO method. Use 4 particles with the initial positions $x_1 = -2$, $x_2 = 0$,
$x_3 = 1$, and $x_4 = 3$. Show detailed calculations for 2 iterations.

13.9 Solve Example 13.4 using the inertia term when $\theta$ varies linearly from 0.9 to 0.4 in
Eq. (13.23).

13.10 Find the minimum of the following function using simulated annealing:

$$f(\mathbf{X}) = 6x_1^2 - 6x_1x_2 + 2x_2^2 - x_1 - 2x_2$$

Assume suitable parameters and show detailed calculations for 2 iterations.

13.11 Consider the following function for maximization using simulated annealing: $f(x) = x(1.5 - x)$
in the range (0, 5). If the initial point is $x^{(0)}$ = 2.0, generate a neighboring
point using a uniformly distributed random number in the range (0, 1). If the temperature
is 400, find the probability of accepting the neighboring point.

13.12 The population of binary strings in a maximization problem is given below:

String          Fitness
0 0 1 1 0 0        8
0 1 0 1 0 1       12
1 0 1 0 1 1        6
1 1 0 0 0 1        2
0 0 0 1 0 0       18
1 0 0 0 0 0        9
0 1 0 1 0 0       10

Determine the expected number of copies of the best string in the above population in
the mating pool using the roulette-wheel selection process.

13.13 Consider the following constrained optimization problem:

Minimize $f = x_1^3 - 6x_1^2 + 11x_1 + x_3$

subject to

$$x_1^2 + x_2^2 - x_3^2 = 0$$
$$4 - x_1^2 - x_2^2 - x_3^2 \le 0$$
$$x_3 - 5 \le 0$$
$$-x_i \le 0; \quad i = 1, 2, 3$$

Define the fitness function to be used in GA for this problem.

13.14 The bounds on the design variables in an optimization problem are given by

$$-10 \le x_1 \le 10, \quad 0 \le x_2 \le 8, \quad 150 \le x_3 \le 750$$

Find the minimum binary string length of a design vector $\mathbf{X} = \{x_1, x_2, x_3\}^T$ to achieve
an accuracy of 0.01.
14.1 INTRODUCTION 

Although the mathematical techniques described in Chapters 3 to 13 can be used 
to solve all engineering optimization problems, the use of engineering judgment and 
approximations helps in reducing the computational effort involved. In this chapter we 
consider several types of approximation techniques that can reduce the analysis time 
without introducing too much error [14.1]. 

These techniques are especially useful in finite element analysis-based optimiza- 
tion procedures. The practical computation of the derivatives of static displacements, 
stresses, eigenvalues, eigenvectors, and transient response of mechanical and 
structural systems is presented. The concept of decomposition, which permits the 
solution of a large optimization problem through a set of smaller, coordinated sub- 
problems is presented. The use of parallel processing and computation in the solution 
of large-scale optimization problems is discussed. Many real-life engineering systems 
involve simultaneous optimization of multiple objective functions under a specified 
set of constraints. Several multiobjective optimization techniques are summarized in 
this chapter. 


14.2 REDUCTION OF SIZE OF AN OPTIMIZATION PROBLEM 
14.2.1 Reduced Basis Technique 

In the optimum design of certain practical systems involving a large number (n) of
design variables, some feasible design vectors $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r$ may be available to start
with. These design vectors may have been suggested by experienced designers or may
be available from the design of similar systems in the past. We can reduce the size of
the optimization problem by expressing the design vector X as a linear combination of
the available feasible design vectors as

$$\mathbf{X} = c_1\mathbf{X}_1 + c_2\mathbf{X}_2 + \cdots + c_r\mathbf{X}_r \tag{14.1}$$

where $c_1, c_2, \ldots, c_r$ are the unknown constants. Then the optimization problem can
be solved using $c_1, c_2, \ldots, c_r$ as design variables. This problem will have a much
smaller number of unknowns since $r \ll n$. In Eq. (14.1), the feasible design vectors
$\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r$ serve as the basis vectors. It can be seen that if $c_1 = c_2 = \cdots =
c_r = 1/r$, then X denotes the average of the basis vectors.
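The reduced basis idea of Eq. (14.1) can be sketched as follows. The basis vectors, the objective function, and the function names are hypothetical; the point is only that an optimizer would search over the r coefficients instead of the n original variables.

```python
def combine(basis, coeffs):
    """Eq. (14.1): X = c_1 X_1 + c_2 X_2 + ... + c_r X_r."""
    n = len(basis[0])
    return [sum(c * vec[k] for c, vec in zip(coeffs, basis)) for k in range(n)]

# Hypothetical feasible designs: r = 3 basis vectors with n = 4 variables each.
basis = [[2.0, 1.0, 3.0, 4.0],
         [1.0, 2.0, 2.0, 5.0],
         [3.0, 3.0, 1.0, 3.0]]

def objective(x):
    """Placeholder objective in the original n variables (sum of squares)."""
    return sum(v * v for v in x)

def reduced_objective(coeffs):
    """The same objective expressed in the r unknowns c_1, ..., c_r."""
    return objective(combine(basis, coeffs))

r = len(basis)
x_avg = combine(basis, [1.0 / r] * r)   # equal coefficients give the average design
f_avg = reduced_objective([1.0 / r] * r)
```

Any optimizer can then be applied to `reduced_objective`, whose argument has only r components.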
14.2.2 Design Variable Linking Technique

When the number of elements or members in a structure is large, it is possible to
reduce the number of design variables by using a technique known as design variable
linking [14.25]. To see this procedure, consider the 12-member truss structure shown
in Fig. 14.1. If the area of cross section of each member is varied independently, we
will have 12 design variables. On the other hand, if symmetry of members about the
vertical (Y) axis is required, the areas of cross section of members 4, 5, 6, 8, and 10
can be assumed to be the same as those of members 1, 2, 3, 7, and 9, respectively.
This reduces the number of independent design variables from 12 to 7. In addition, if
the cross-sectional area of member 12 is required to be three times that of member 11,
we will have six independent design variables only:

$$\mathbf{X} = \{x_1 \; x_2 \; x_3 \; x_4 \; x_5 \; x_6\}^T = \{A_1 \; A_2 \; A_3 \; A_7 \; A_9 \; A_{11}\}^T \tag{14.2}$$

Once the vector X is known, the dependent variables can be determined as $A_4 = A_1$,
$A_5 = A_2$, $A_6 = A_3$, $A_8 = A_7$, $A_{10} = A_9$, and $A_{12} = 3A_{11}$. This procedure of treating
certain variables as dependent variables is known as design variable linking. By
defining the vector of all variables as

$$\mathbf{Z} = \{z_1 \; z_2 \; \cdots \; z_{12}\}^T = \{A_1 \; A_2 \; \cdots \; A_{12}\}^T$$
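the relationship between Z and X can be expressed as

$$\underset{12 \times 1}{\mathbf{Z}} = \underset{12 \times 6}{[T]} \; \underset{6 \times 1}{\mathbf{X}} \tag{14.3}$$

where the matrix [T] is given by

$$[T] = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 3
\end{bmatrix} \tag{14.4}$$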
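The linking relation Z = [T]X can be checked with a short script. The matrix below follows the member relations stated above for the 12-member truss; the numerical values of the six independent areas and the function name are hypothetical.

```python
# Rows correspond to members 1..12; columns to x_1..x_6 = A_1, A_2, A_3, A_7, A_9, A_11.
T = [
    [1, 0, 0, 0, 0, 0],  # A_1
    [0, 1, 0, 0, 0, 0],  # A_2
    [0, 0, 1, 0, 0, 0],  # A_3
    [1, 0, 0, 0, 0, 0],  # A_4 = A_1
    [0, 1, 0, 0, 0, 0],  # A_5 = A_2
    [0, 0, 1, 0, 0, 0],  # A_6 = A_3
    [0, 0, 0, 1, 0, 0],  # A_7
    [0, 0, 0, 1, 0, 0],  # A_8 = A_7
    [0, 0, 0, 0, 1, 0],  # A_9
    [0, 0, 0, 0, 1, 0],  # A_10 = A_9
    [0, 0, 0, 0, 0, 1],  # A_11
    [0, 0, 0, 0, 0, 3],  # A_12 = 3 A_11
]

def expand(x):
    """Eq. (14.3): recover all 12 member areas from the 6 independent variables."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical independent areas
Z = expand(X)
```

The optimizer works with X only; the analysis routine receives the full vector Z returned by `expand`.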


The concept can be extended to many other situations. For example, if the geometry
of the structure is to be varied during optimization (configuration optimization) while
maintaining (1) symmetry about the Y axis and (2) alignment of the three nodes 2, 3,
and 4 (and 6, 7, and 4), we can define the following independent and dependent design
variables:

Independent variables: $X_5$, $X_6$, $Y_6$, $Y_7$, $Y_4$

Dependent variables:

$$X_1 = -X_5, \quad X_2 = -X_6, \quad Y_2 = Y_6, \quad Y_3 = Y_7, \quad X_7 = \frac{Y_4 - Y_7}{Y_4 - Y_6} X_6,$$
$$X_3 = -X_7, \quad X_4 = 0, \quad Y_1 = 0, \quad Y_5 = 0$$

Thus the design vector X is

$$\mathbf{X} = \{x_1 \; x_2 \; x_3 \; x_4 \; x_5\}^T = \{X_5 \; X_6 \; Y_6 \; Y_7 \; Y_4\}^T \tag{14.5}$$

The relationship between the dependent and independent variables can be defined more
systematically, by defining a vector of all geometry variables, Z, as

$$\mathbf{Z} = \{z_1 \; z_2 \; \cdots \; z_{14}\}^T = \{X_1 \; Y_1 \; X_2 \; Y_2 \; X_3 \; Y_3 \; X_4 \; Y_4 \; X_5 \; Y_5 \; X_6 \; Y_6 \; X_7 \; Y_7\}^T$$

which is related to X through the relations

$$z_i = f_i(\mathbf{X}), \quad i = 1, 2, \ldots, 14 \tag{14.6}$$

where $f_i$ denotes a function of X.
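14.3 FAST REANALYSIS TECHNIQUES
14.3.1 Incremental Response Approach

Let the displacement vector of the structure or machine, $\mathbf{Y}_0$, corresponding to the load
vector, $\mathbf{P}_0$, be given by the solution of the equilibrium equations

$$[K_0]\mathbf{Y}_0 = \mathbf{P}_0 \tag{14.7}$$

or

$$\mathbf{Y}_0 = [K_0]^{-1}\mathbf{P}_0 \tag{14.8}$$

where $[K_0]$ is the stiffness matrix corresponding to the design vector, $\mathbf{X}_0$. When the
design vector is changed to $\mathbf{X}_0 + \Delta\mathbf{X}$, let the stiffness matrix of the system change to
$[K_0] + [\Delta K]$, the displacement vector to $\mathbf{Y}_0 + \Delta\mathbf{Y}$, and the load vector to $\mathbf{P}_0 + \Delta\mathbf{P}$.
The equilibrium equations at the new design vector, $\mathbf{X}_0 + \Delta\mathbf{X}$, can be expressed as

$$([K_0] + [\Delta K])(\mathbf{Y}_0 + \Delta\mathbf{Y}) = \mathbf{P}_0 + \Delta\mathbf{P} \tag{14.9}$$

or

$$[K_0]\mathbf{Y}_0 + [\Delta K]\mathbf{Y}_0 + [K_0]\Delta\mathbf{Y} + [\Delta K]\Delta\mathbf{Y} = \mathbf{P}_0 + \Delta\mathbf{P} \tag{14.10}$$

Subtracting Eq. (14.7) from Eq. (14.10), we obtain

$$([K_0] + [\Delta K])\Delta\mathbf{Y} = \Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0 \tag{14.11}$$

By neglecting the term $[\Delta K]\Delta\mathbf{Y}$, Eq. (14.11) can be reduced to

$$[K_0]\Delta\mathbf{Y} \approx \Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0 \tag{14.12}$$

which yields the first approximation to the increment in displacement vector $\Delta\mathbf{Y}$ as

$$\Delta\mathbf{Y}_1 = [K_0]^{-1}(\Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0) \tag{14.13}$$

where $[K_0]^{-1}$ is available from the solution in Eq. (14.8). We can find a better approximation
of $\Delta\mathbf{Y}$ by subtracting Eq. (14.12) from Eq. (14.11):

$$([K_0] + [\Delta K])\Delta\mathbf{Y} - [K_0]\Delta\mathbf{Y}_1 = \Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0 - (\Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0) \tag{14.14}$$

or

$$([K_0] + [\Delta K])(\Delta\mathbf{Y} - \Delta\mathbf{Y}_1) = -[\Delta K]\Delta\mathbf{Y}_1 \tag{14.15}$$

By defining

$$\Delta\mathbf{Y}_2 = \Delta\mathbf{Y} - \Delta\mathbf{Y}_1 \tag{14.16}$$

Eq. (14.15) can be expressed as

$$([K_0] + [\Delta K])\Delta\mathbf{Y}_2 = -[\Delta K]\Delta\mathbf{Y}_1 \tag{14.17}$$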
Neglecting the term $[\Delta K]\Delta\mathbf{Y}_2$, Eq. (14.17) can be used to obtain the second approximation
to $\Delta\mathbf{Y}$, $\Delta\mathbf{Y}_2$, as

$$\Delta\mathbf{Y}_2 = -[K_0]^{-1}([\Delta K]\Delta\mathbf{Y}_1) \tag{14.18}$$

From Eq. (14.16), $\Delta\mathbf{Y}$ can be written as

$$\Delta\mathbf{Y} = \sum_{i=1}^{2} \Delta\mathbf{Y}_i \tag{14.19}$$

This process can be continued and $\Delta\mathbf{Y}$ can be expressed, in general, as

$$\Delta\mathbf{Y} = \sum_{i=1}^{\infty} \Delta\mathbf{Y}_i \tag{14.20}$$

where $\Delta\mathbf{Y}_i$ is found by solving the equations

$$[K_0]\Delta\mathbf{Y}_i = -[\Delta K]\Delta\mathbf{Y}_{i-1} \tag{14.21}$$

Note that the series given by Eq. (14.20) may not converge if the change in the
design vector, $\Delta\mathbf{X}$, is not small. Hence it is important to establish the validity of the
procedure for each problem, by determining the step sizes for which the series will
converge, before using it. The iterative process is usually stopped by specifying
a maximum number of iterations and/or by prescribing a convergence criterion
such as

$$\frac{\|\Delta\mathbf{Y}_i\|}{\left\|\sum_{j=1}^{i} \Delta\mathbf{Y}_j\right\|} \le \varepsilon \tag{14.22}$$

where $\|\Delta\mathbf{Y}_i\|$ is the Euclidean norm of the vector $\Delta\mathbf{Y}_i$ and $\varepsilon$ is a small number on
the order of 0.01.
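The incremental response series can be sketched on a small system. The code below is an illustration only, assuming a hypothetical 2 × 2 stiffness matrix, a hypothetical perturbation, and a stopping test that compares the norm of the latest term with the norm of the accumulated increment. A practical implementation would factorize the base stiffness matrix once and reuse the factors; here a direct 2 × 2 solve is repeated for brevity.

```python
import math

def solve2(k, b):
    """Solve a 2x2 linear system k y = b by Cramer's rule."""
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    return [(b[0] * k[1][1] - k[0][1] * b[1]) / det,
            (k[0][0] * b[1] - b[0] * k[1][0]) / det]

def matvec(k, y):
    return [k[0][0] * y[0] + k[0][1] * y[1], k[1][0] * y[0] + k[1][1] * y[1]]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def reanalyze(k0, dk, p0, dp, eps=1e-10, max_terms=100):
    """Approximate the displacement at the perturbed design using only k0 solves."""
    y0 = solve2(k0, p0)                                    # Eq. (14.8)
    rhs = [dp[i] - c for i, c in enumerate(matvec(dk, y0))]
    dy_i = solve2(k0, rhs)                                 # Eq. (14.13)
    dy = dy_i[:]
    for _ in range(max_terms):
        if norm(dy_i) <= eps * norm(dy):                   # convergence test
            break
        dy_i = solve2(k0, [-c for c in matvec(dk, dy_i)])  # Eq. (14.21)
        dy = [a + b for a, b in zip(dy, dy_i)]             # partial sum of Eq. (14.20)
    return [a + b for a, b in zip(y0, dy)]

# Hypothetical data: base stiffness, stiffness change, and load (no load change).
K0 = [[4.0, 1.0], [1.0, 3.0]]
dK = [[0.4, 0.1], [0.1, 0.6]]
P0 = [1.0, 2.0]
y_series = reanalyze(K0, dK, P0, [0.0, 0.0])
```

For this data the successive terms shrink by roughly a factor of five, so only a handful of terms are needed; a larger perturbation would slow or prevent convergence, as noted above.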


Example 14.1 Consider the crane (planar truss) shown in Fig. 14.2. Young’s modulus
of member e is equal to $E_e$ = 30 × 10^6 psi (e = 1, 2, 3, 4), and the other data are
shown in Table 14.1. Assuming the base design to be $A_1 = A_2$ = 2 in.^2 and $A_3 =
A_4$ = 1 in.^2, and perturbations to be $\Delta A_1 = \Delta A_2$ = 0.4 in.^2 and $\Delta A_3 = \Delta A_4$ = 0.2
in.^2, determine (a) the exact displacements of nodes 3 and 4 at the base design, (b) the
displacements of nodes 3 and 4 at the perturbed design using the exact procedure, and
(c) the displacements of nodes 3 and 4 at the perturbed design using the approximation
method.


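SOLUTION The stiffness matrix of a typical element e is given by

$$[K^{(e)}] = \frac{A_e E_e}{l_e} \begin{bmatrix}
l_{ij}^2 & l_{ij} m_{ij} & -l_{ij}^2 & -l_{ij} m_{ij} \\
l_{ij} m_{ij} & m_{ij}^2 & -l_{ij} m_{ij} & -m_{ij}^2 \\
-l_{ij}^2 & -l_{ij} m_{ij} & l_{ij}^2 & l_{ij} m_{ij} \\
-l_{ij} m_{ij} & -m_{ij}^2 & l_{ij} m_{ij} & m_{ij}^2
\end{bmatrix} \tag{E1}$$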
Table 14.1

                                      Global node of:       Direction cosines of member
Member,  Area of cross   Length,      Corner    Corner
e        section, A_e    l_e (in.)    1, i      2, j         l_ij        m_ij
1        A_1             55.9017      1         3            0.8944      0.4472
2        A_2             55.9017      3         2            0.8944     -0.4472
3        A_3             167.7051     3         4            0.8944      0.4472
4        A_4             141.4214     2         4            0.7071      0.7071

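where $A_e$ is the cross-sectional area, $E_e$ is Young’s modulus, $l_e$ is the length, and
$(l_{ij}, m_{ij})$ are the direction cosines of member e. Equation (E1) can be used to compute
the stiffness matrices of the various members using the data of Table 14.1. When the
member stiffness matrices are assembled and the boundary conditions ($y_1 = y_2 = y_3 =
y_4 = 0$) are applied, the overall stiffness matrix becomes

$$[K] = (30 \times 10^6) \begin{bmatrix}
\left(\dfrac{0.8A_1}{55.9017} + \dfrac{0.8A_2}{55.9017} + \dfrac{0.8A_3}{167.7051}\right) & \left(\dfrac{0.4A_1}{55.9017} - \dfrac{0.4A_2}{55.9017} + \dfrac{0.4A_3}{167.7051}\right) & \left(-\dfrac{0.8A_3}{167.7051}\right) & \left(-\dfrac{0.4A_3}{167.7051}\right) \\
\left(\dfrac{0.4A_1}{55.9017} - \dfrac{0.4A_2}{55.9017} + \dfrac{0.4A_3}{167.7051}\right) & \left(\dfrac{0.2A_1}{55.9017} + \dfrac{0.2A_2}{55.9017} + \dfrac{0.2A_3}{167.7051}\right) & \left(-\dfrac{0.4A_3}{167.7051}\right) & \left(-\dfrac{0.2A_3}{167.7051}\right) \\
\left(-\dfrac{0.8A_3}{167.7051}\right) & \left(-\dfrac{0.4A_3}{167.7051}\right) & \left(\dfrac{0.8A_3}{167.7051} + \dfrac{0.5A_4}{141.4214}\right) & \left(\dfrac{0.4A_3}{167.7051} + \dfrac{0.5A_4}{141.4214}\right) \\
\left(-\dfrac{0.4A_3}{167.7051}\right) & \left(-\dfrac{0.2A_3}{167.7051}\right) & \left(\dfrac{0.4A_3}{167.7051} + \dfrac{0.5A_4}{141.4214}\right) & \left(\dfrac{0.2A_3}{167.7051} + \dfrac{0.5A_4}{141.4214}\right)
\end{bmatrix} \tag{E2}$$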
Thus the equilibrium equations of the structure can be expressed as

$$[K]\mathbf{Y} = \mathbf{P} \tag{E3}$$

where

$$\mathbf{Y} = \{y_5 \; y_6 \; y_7 \; y_8\}^T \quad \text{and} \quad \mathbf{P} = \{p_5 \; p_6 \; p_7 \; p_8\}^T = \{0 \; 0 \; 0 \; -1000\}^T$$

(a) At the base design, $A_1 = A_2$ = 2 in.^2, $A_3 = A_4$ = 1 in.^2, and the exact solution
of Eqs. (E3) gives the displacements of nodes 3 and 4 as

$$\{y_5 \; y_6 \; y_7 \; y_8\}^T_{\text{base}} = \{0.001165 \; 0.002329 \; 0.05147 \; -0.07032\}^T$$

(b) At the perturbed design, $A_1 = A_2$ = 2.4 in.^2, $A_3 = A_4$ = 1.2 in.^2, and the exact
solution of Eq. (E3) gives the displacements of nodes 3 and 4 as

$$\{y_5 \; y_6 \; y_7 \; y_8\}^T_{\text{perturb}} = \{0.0009705 \; 0.001941 \; 0.04289 \; -0.05860\}^T$$

(c) The values of $A_1 = A_2$ = 2.4 in.^2 and $A_3 = A_4$ = 1.2 in.^2 at the perturbed
design are used to compute the new stiffness matrix as $[K]_{\text{perturb}} = [K] + [\Delta K]$,
which is then used to compute $\Delta\mathbf{Y}_1, \Delta\mathbf{Y}_2, \ldots$ using the approximation procedure,
Eqs. (14.13) and (14.21). The results are shown in Table 14.2. It can be
seen that the solution given by Eq. (14.20) converges rapidly.


14.3.2 Basis Vector Approach 

In structural optimization involving static response, it is possible to conduct an approx- 
imate analysis at modified designs based on a limited number of exact analysis results. 
This results in a substantial saving in computer time since, in most problems, the num- 
ber of design variables is far smaller than the number of degrees of freedom of the 
system. Consider the equilibrium equations of the structure in the form 


$$\underset{m \times m}{[K]} \; \underset{m \times 1}{\mathbf{Y}} = \underset{m \times 1}{\mathbf{P}} \tag{14.23}$$

where [K] is the stiffness matrix, Y the vector of displacements, and P the load vector.
Let the structure have n design variables denoted by the design vector

$$\mathbf{X} = \{x_1 \; x_2 \; \cdots \; x_n\}^T$$

Table 14.2

Exact Y_0 = {0.116462E-02  0.232923E-02  0.514654E-01  -0.703216E-01}^T
Exact (Y_0 + ΔY) = {0.970515E-03  0.194103E-02  0.428879E-01  -0.586014E-01}^T

Value of i    ΔY_i               Y_i = Y_0 + sum of ΔY_k (k = 1 to i)
1             -0.232922E-03       0.931695E-03
              -0.465844E-03       0.186339E-02
              -0.102930E-01       0.411724E-01
               0.140642E-01      -0.562573E-01
2              0.465842E-04       0.978279E-03
               0.931683E-04       0.195656E-02
               0.205859E-02       0.432310E-01
              -0.281283E-02      -0.590702E-01
3             -0.931678E-05       0.968962E-03
              -0.186335E-04       0.193792E-02
              -0.411716E-03       0.428193E-01
               0.562563E-03      -0.585076E-01
4              0.186335E-05       0.970825E-03
               0.372669E-05       0.194165E-02
               0.823429E-04       0.429016E-01
              -0.112512E-03      -0.586201E-01
If we find the exact solution at r basic design vectors $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r$, the corresponding
solutions, $\mathbf{Y}_i$, are found by solving the equations

$$[K_i]\mathbf{Y}_i = \mathbf{P}, \quad i = 1, 2, \ldots, r \tag{14.24}$$

where the stiffness matrix, $[K_i]$, is determined at the design vector $\mathbf{X}_i$. If we consider a
new design vector, $\mathbf{X}_N$, in the neighborhood of the basic design vectors, the equilibrium
equations at $\mathbf{X}_N$ can be expressed as

$$[K_N]\mathbf{Y}_N = \mathbf{P} \tag{14.25}$$

where $[K_N]$ is the stiffness matrix evaluated at $\mathbf{X}_N$. By approximating $\mathbf{Y}_N$ as a linear
combination of the basic displacement vectors $\mathbf{Y}_i$, i = 1, 2, ..., r, we have

$$\mathbf{Y}_N \approx c_1\mathbf{Y}_1 + c_2\mathbf{Y}_2 + \cdots + c_r\mathbf{Y}_r = [Y]\mathbf{C} \tag{14.26}$$

where $[Y] = [\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_r]$ is an m × r matrix and $\mathbf{C} = \{c_1, c_2, \ldots, c_r\}^T$ is an
r-component column vector. Substitution of Eq. (14.26) into Eq. (14.25) gives

$$[K_N][Y]\mathbf{C} = \mathbf{P} \tag{14.27}$$
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By premultiplying Eq. (14.27) by [F] T we obtain 

[K] C = P 

rxr rx 1 rx 1 

where 

m = tn T [^][n 

P = [T] r P 

It can be seen that an approximate displacement vector Y N can be obtained by solv- 
ing a smaller (r) system of equations, Eq. (14.28), instead of computing the exact 
solution Y N by solving a larger (n) system of equations, Eq. (14.25). The foregoing 
method is equivalent to applying the Ritz-Galerkin principle in the subspace spanned 
by the set of vectors Yi, Y 2 , . . . , Y, . The assumed modes Y, , i = 1,2, ... ,r, can be 
considered to be good basis vectors since they are the solutions of similar sets of 
equations. 

Fox and Miura 14.3 applied this method for the analysis of a 124-member, 
96-degree-of-freedom space truss (shown in Fig. 14.3). By using a 5-degree-of-freedom 
approximation, they observed that the solution of Eq. (14.28) required 0.653 s while 
the solution of Eq. (14.25) required 5.454 s without exceeding 1% error in the 
maximum displacement components of the structure. 


(14.28) 

(14.29) 

(14.30) 
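
The reduced-basis procedure of Eqs. (14.24) to (14.30) can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the structure is a hypothetical chain of three springs fixed at one end (not the space truss of Fig. 14.3), the design variables are the spring stiffnesses, and the helper names `solve`, `stiffness`, and `reduced_basis_solution` are invented for this sketch.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def stiffness(x):
    """Assumed model: three springs in a chain, left end fixed; x = [k1, k2, k3]."""
    k1, k2, k3 = x
    return [[k1 + k2, -k2, 0.0],
            [-k2, k2 + k3, -k3],
            [0.0, -k3, k3]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reduced_basis_solution(basis_designs, x_new, P):
    """Approximate Y_N from exact solutions at the basic designs."""
    Ys = [solve(stiffness(x), P) for x in basis_designs]        # Eq. (14.24)
    K_new = stiffness(x_new)
    KY = [[dot(row, Y) for row in K_new] for Y in Ys]           # [K_N] Y_i
    K_red = [[dot(Yi, KYj) for KYj in KY] for Yi in Ys]         # Eq. (14.29)
    P_red = [dot(Yi, P) for Yi in Ys]                           # Eq. (14.30)
    c = solve(K_red, P_red)                                     # Eq. (14.28)
    return [sum(c[i] * Ys[i][j] for i in range(len(Ys)))        # Eq. (14.26)
            for j in range(len(P))]
```

Only an $r \times r$ system is solved for the new design; the $n \times n$ solves are confined to the basic designs. When the exact solution at the new design happens to lie in the span of the basis vectors, the approximation reproduces it.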


14.4 DERIVATIVES OF STATIC DISPLACEMENTS AND STRESSES

The gradient-based optimization methods require the gradients of the objective and
constraint functions. Thus the partial derivatives of the response quantities with respect
to the design variables are required. Many practical applications require a finite-element
analysis for computing the values of the objective function and/or constraint functions
at any design vector. Since the objective and/or constraint functions are to be evaluated
at a large number of trial design vectors during optimization, the computation of the
derivatives of the response quantities requires substantial computational effort. It is
possible to derive approximate expressions for the response quantities. The derivatives
of static displacements, stresses, eigenvalues, eigenvectors, and transient response of
structural and mechanical systems are presented in this and the following two sections.

Figure 14.3 Space truss [14.3].

The equilibrium equations of a machine or structure can be expressed as

$$[K]\mathbf{Y} = \mathbf{P} \tag{14.31}$$

where $[K]$ is the stiffness matrix, $\mathbf{Y}$ the displacement vector, and $\mathbf{P}$ the load vector.
By differentiating Eq. (14.31) with respect to the design variable $x_i$, we obtain

$$\frac{\partial [K]}{\partial x_i}\mathbf{Y} + [K]\frac{\partial \mathbf{Y}}{\partial x_i} = \frac{\partial \mathbf{P}}{\partial x_i} \tag{14.32}$$

where $\partial [K]/\partial x_i$ denotes the matrix formed by differentiating the elements of $[K]$ with
respect to $x_i$. Usually, the matrix is computed using a finite-difference scheme as

$$\frac{\partial [K]}{\partial x_i} \approx \frac{\Delta [K]}{\Delta x_i} = \frac{[K]_{\text{new}} - [K]}{\Delta x_i} \tag{14.33}$$

where $[K]_{\text{new}}$ is the stiffness matrix evaluated at the perturbed design vector $\mathbf{X} + \Delta\mathbf{X}_i$,
where the vector $\Delta\mathbf{X}_i$ contains $\Delta x_i$ in the $i$th location and zero everywhere else:

$$\Delta\mathbf{X}_i = \{0\; 0\; \cdots\; 0\; \Delta x_i\; 0\; \cdots\; 0\}^T \tag{14.34}$$

In most cases the load vector $\mathbf{P}$ is either independent of the design variables or a
known function of the design variables, and hence the derivatives, $\partial \mathbf{P}/\partial x_i$, can be
evaluated with no difficulty. Equations (14.32) can be solved to find the derivatives of
the displacements as

$$\frac{\partial \mathbf{Y}}{\partial x_i} = [K]^{-1}\left(\frac{\partial \mathbf{P}}{\partial x_i} - \frac{\partial [K]}{\partial x_i}\mathbf{Y}\right) \tag{14.35}$$

Since $[K]^{-1}$ or its equivalent is available from the solution of Eqs. (14.31), Eqs. (14.35)
can readily be solved to find the derivatives of static displacements with respect to the
design variables.

The stresses in a machine or structure (in a particular finite element) can be determined
using the relation

$$\boldsymbol{\sigma} = [R]\mathbf{Y} \tag{14.36}$$

where $[R]$ denotes the matrix that relates stresses to nodal displacements. The derivatives
of stresses can then be computed as

$$\frac{\partial \boldsymbol{\sigma}}{\partial x_i} = [R]\frac{\partial \mathbf{Y}}{\partial x_i} \tag{14.37}$$

where the matrix $[R]$ is usually independent of the design variables and the vector
$\partial \mathbf{Y}/\partial x_i$ is given by Eq. (14.35).
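
As a rough illustration of Eqs. (14.33) and (14.35), the following Python sketch computes displacement derivatives for a hypothetical two-spring system fixed at one end. The model, the load vector, and the function names are assumptions made for this example only; the load is taken to be independent of the design variables, so $\partial \mathbf{P}/\partial x_i = \mathbf{0}$.

```python
def solve2(K, b):
    """Solve a 2x2 linear system K y = b by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(b[0] * K[1][1] - K[0][1] * b[1]) / det,
            (K[0][0] * b[1] - b[0] * K[1][0]) / det]


def stiffness(x):
    """Assumed model: two springs in series, left end fixed; x = [k1, k2]."""
    k1, k2 = x
    return [[k1 + k2, -k2], [-k2, k2]]


def displacement_derivative(x, P, i, dx=1e-6):
    """Return dY/dx_i from Eq. (14.35), with dK/dx_i from Eq. (14.33)."""
    K = stiffness(x)
    Y = solve2(K, P)                      # Eq. (14.31)
    x_pert = list(x)
    x_pert[i] += dx                       # perturbed design X + dX_i
    K_new = stiffness(x_pert)
    dK = [[(K_new[r][c] - K[r][c]) / dx for c in range(2)] for r in range(2)]
    # Right-hand side dP/dx_i - (dK/dx_i) Y, with dP/dx_i = 0 assumed
    rhs = [-(dK[r][0] * Y[0] + dK[r][1] * Y[1]) for r in range(2)]
    return solve2(K, rhs)                 # reuse [K], as in Eq. (14.35)
```

For a unit load at the free end the exact displacements are $1/k_1$ and $1/k_1 + 1/k_2$, so the result can be checked by hand: calling `displacement_derivative([2.0, 4.0], [0.0, 1.0], 0)` should return values close to $-1/k_1^2 = -0.25$ for both components.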
14.5 DERIVATIVES OF EIGENVALUES AND EIGENVECTORS

Let the eigenvalue problem be given by [14.4, 14.6, 14.10]

$$\underset{m \times m}{[K]}\;\underset{m \times 1}{\mathbf{Y}} = \lambda \underset{m \times m}{[M]}\;\underset{m \times 1}{\mathbf{Y}} \tag{14.38}$$

where $\lambda$ is the eigenvalue, $\mathbf{Y}$ the eigenvector, $[K]$ the stiffness matrix, and $[M]$ the
mass matrix corresponding to the design vector $\mathbf{X} = \{x_1\; x_2\; \cdots\; x_n\}^T$. Let the solution
of Eq. (14.38) be given by the eigenvalues $\lambda_i$ and the eigenvectors $\mathbf{Y}_i$, $i = 1, 2, \ldots, m$:

$$[P_i]\mathbf{Y}_i = \mathbf{0} \tag{14.39}$$

where $[P_i]$ is a symmetric matrix given by

$$[P_i] = [K] - \lambda_i[M] \tag{14.40}$$

14.5.1 Derivatives of $\lambda_i$

Premultiplication of Eq. (14.39) by $\mathbf{Y}_i^T$ gives

$$\mathbf{Y}_i^T[P_i]\mathbf{Y}_i = 0 \tag{14.41}$$

Differentiation of Eq. (14.41) with respect to the design variable $x_j$ gives

$$\mathbf{Y}_{i,j}^T[P_i]\mathbf{Y}_i + \mathbf{Y}_i^T\frac{\partial [P_i]}{\partial x_j}\mathbf{Y}_i + \mathbf{Y}_i^T[P_i]\mathbf{Y}_{i,j} = 0 \tag{14.42}$$

where $\mathbf{Y}_{i,j} = \partial \mathbf{Y}_i/\partial x_j$. In view of Eq. (14.39), Eq. (14.42) reduces to

$$\mathbf{Y}_i^T\frac{\partial [P_i]}{\partial x_j}\mathbf{Y}_i = 0 \tag{14.43}$$

Differentiation of Eq. (14.40) gives

$$\frac{\partial [P_i]}{\partial x_j} = \frac{\partial [K]}{\partial x_j} - \lambda_i\frac{\partial [M]}{\partial x_j} - \frac{\partial \lambda_i}{\partial x_j}[M] \tag{14.44}$$

where $\partial [K]/\partial x_j$ and $\partial [M]/\partial x_j$ denote the matrices formed by differentiating the elements
of the $[K]$ and $[M]$ matrices, respectively, with respect to $x_j$. If the eigenvectors are
normalized with respect to the mass matrix, we have [14.10]

$$\mathbf{Y}_i^T[M]\mathbf{Y}_i = 1 \tag{14.45}$$

Substituting Eq. (14.44) into Eq. (14.43) and using Eq. (14.45) gives the derivative of
$\lambda_i$ with respect to $x_j$ as

$$\frac{\partial \lambda_i}{\partial x_j} = \mathbf{Y}_i^T\left[\frac{\partial [K]}{\partial x_j} - \lambda_i\frac{\partial [M]}{\partial x_j}\right]\mathbf{Y}_i \tag{14.46}$$

It can be noted that Eq. (14.46) involves only the eigenvalue and eigenvector under
consideration and hence the complete solution of the eigenvalue problem is not required
to find the value of $\partial \lambda_i/\partial x_j$.
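
Equation (14.46) can be tried out on a small problem. The sketch below assumes a hypothetical two-degree-of-freedom spring-mass chain with a constant diagonal mass matrix (so $\partial [M]/\partial x_j = [0]$); the model, the constant `MASS`, and the function names are illustrative choices, not taken from the text. The $2 \times 2$ eigenproblem is solved in closed form so that no external library is needed.

```python
import math

MASS = [2.0, 1.0]  # assumed diagonal mass matrix, independent of the design


def stiffness(x):
    """Assumed model: two springs in series, left end fixed; x = [k1, k2]."""
    k1, k2 = x
    return [[k1 + k2, -k2], [-k2, k2]]


def smallest_eigenpair(x):
    """Smallest eigenvalue of [K]Y = lambda [M]Y and its mass-normalized eigenvector."""
    K = stiffness(x)
    m1, m2 = MASS
    # det([K] - lambda [M]) = a*lambda^2 + b*lambda + c = 0
    a = m1 * m2
    b = -(m2 * K[0][0] + m1 * K[1][1])
    c = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    lam = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    y = [-K[0][1], K[0][0] - lam * m1]      # satisfies the first row of Eq. (14.39)
    norm = math.sqrt(m1 * y[0] ** 2 + m2 * y[1] ** 2)
    return lam, [y[0] / norm, y[1] / norm]  # Y^T [M] Y = 1, Eq. (14.45)


def eigenvalue_derivative(x, j):
    """d(lambda)/dx_j from Eq. (14.46) with d[M]/dx_j = 0."""
    _, y = smallest_eigenpair(x)
    # Exact d[K]/dx_j for this model: index 0 is k1, index 1 is k2
    dK = [[[1.0, 0.0], [0.0, 0.0]],
          [[1.0, -1.0], [-1.0, 1.0]]][j]
    return sum(y[r] * dK[r][c] * y[c] for r in range(2) for c in range(2))
```

Only the eigenpair under consideration enters the computation, which is the point made after Eq. (14.46). The value can be cross-checked against a central finite difference of `smallest_eigenpair(x)[0]`.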
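14.5.2 Derivatives of $\mathbf{Y}_i$

The differentiation of Eqs. (14.39) and (14.45) with respect to $x_j$ results in

$$[P_i]\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\frac{\partial [P_i]}{\partial x_j}\mathbf{Y}_i \tag{14.47}$$

$$2\mathbf{Y}_i^T[M]\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\mathbf{Y}_i^T\frac{\partial [M]}{\partial x_j}\mathbf{Y}_i \tag{14.48}$$

where $\partial [P_i]/\partial x_j$ is given by Eq. (14.44). Equations (14.47) and (14.48) can be shown
to be linearly independent and can be written together as

$$\underset{(m+1) \times m}{\begin{bmatrix} [P_i] \\ 2\mathbf{Y}_i^T[M] \end{bmatrix}}\;\underset{m \times 1}{\frac{\partial \mathbf{Y}_i}{\partial x_j}} = -\underset{(m+1) \times m}{\begin{bmatrix} \dfrac{\partial [P_i]}{\partial x_j} \\ \mathbf{Y}_i^T\dfrac{\partial [M]}{\partial x_j} \end{bmatrix}}\;\underset{m \times 1}{\mathbf{Y}_i} \tag{14.49}$$

By premultiplying Eq. (14.49) by the $m \times (m+1)$ matrix $\big[\,[P_i] \;\; [M]\mathbf{Y}_i\,\big]$ we obtain

$$\underset{m \times m}{\Big[[P_i][P_i] + 2[M]\mathbf{Y}_i\mathbf{Y}_i^T[M]\Big]}\;\underset{m \times 1}{\frac{\partial \mathbf{Y}_i}{\partial x_j}} = -\underset{m \times m}{\left[[P_i]\frac{\partial [P_i]}{\partial x_j} + [M]\mathbf{Y}_i\mathbf{Y}_i^T\frac{\partial [M]}{\partial x_j}\right]}\;\underset{m \times 1}{\mathbf{Y}_i} \tag{14.50}$$

The solution of Eq. (14.50) gives the desired expression for the derivative of the
eigenvector, $\partial \mathbf{Y}_i/\partial x_j$, as

$$\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\Big[[P_i][P_i] + 2[M]\mathbf{Y}_i\mathbf{Y}_i^T[M]\Big]^{-1}\left[[P_i]\frac{\partial [P_i]}{\partial x_j} + [M]\mathbf{Y}_i\mathbf{Y}_i^T\frac{\partial [M]}{\partial x_j}\right]\mathbf{Y}_i \tag{14.51}$$

Again it can be seen that only the eigenvalue and eigenvector under consideration are
involved in the evaluation of the derivatives of eigenvectors.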


Figure 14.4 Cylindrical cantilever beam. Data: $x_i = 0.25$ in. ($i = 1, 2, 3$), $\rho = 0.283$ lb/in$^3$,
$E = 30 \times 10^6$ psi.
Table 14.3 Derivatives of Eigenvalues [14.4]


i 

Eigenvalue, X t 

a 9 Xi 

10- 9 — 

dx\ 

io- 9 ^i 

10-^i 

9xi 

10- 2 — 

9xi 

1 

24.66 

0.3209 

-0.1582 

1.478 

-2.298 

2 

974.7 

3.86 

-0.4144 

0.057 

-3.046 

3 

7782.0 

23.5 

21.67 

0.335 

-5.307 


For illustration, a cylindrical cantilever beam is considered [14.4]. The beam is
modeled with three finite elements with six degrees of freedom as indicated in Fig. 14.4.
The diameters of the beam are considered as the design variables, $x_i$, $i = 1, 2, 3$. The
first three eigenvalues and their derivatives are shown in Table 14.3 [14.4].


14.6 DERIVATIVES OF TRANSIENT RESPONSE 

The equations of motion of an $n$-degree-of-freedom mechanical/structural system with
viscous damping can be expressed as [14.10]

$$[M]\ddot{\mathbf{Y}} + [C]\dot{\mathbf{Y}} + [K]\mathbf{Y} = \mathbf{F}(t) \tag{14.52}$$

where $[M]$, $[C]$, and $[K]$ are the $n \times n$ mass, damping, and stiffness matrices, respectively,
$\mathbf{F}(t)$ is the $n$-component force vector, $\mathbf{Y}$ is the $n$-component displacement vector,
and a dot over a symbol indicates differentiation with respect to time. Equations
(14.52) denote a set of $n$ coupled second-order differential equations. In most practical
cases, $n$ will be very large and Eqs. (14.52) are stiff; hence the numerical solution of
Eqs. (14.52) is tedious and yields an accurate solution only for the low-frequency
components. To reduce the size of the problem, the displacement solution, $\mathbf{Y}$, is
expressed in terms of $r$ basis functions $\boldsymbol{\Phi}_1, \boldsymbol{\Phi}_2, \ldots,$ and $\boldsymbol{\Phi}_r$ (with $r \ll n$) as

$$\mathbf{Y} = [\Phi]\mathbf{q} \quad \text{or} \quad y_j = \sum_{k=1}^{r} \Phi_{jk}\, q_k(t), \quad j = 1, 2, \ldots, n \tag{14.53}$$

where

$$[\Phi] = [\boldsymbol{\Phi}_1\; \boldsymbol{\Phi}_2\; \cdots\; \boldsymbol{\Phi}_r]$$

is the matrix of basis functions, $\Phi_{jk}$ the element in row $j$ and column $k$ of the matrix
$[\Phi]$, $\mathbf{q}$ an $r$-component vector of reduced coordinates, and $q_k(t)$ the $k$th component
of the vector $\mathbf{q}$. By substituting Eq. (14.53) into Eq. (14.52) and premultiplying the
resulting equation by $[\Phi]^T$, we obtain a system of $r$ differential equations:

$$[\tilde{M}]\ddot{\mathbf{q}} + [\tilde{C}]\dot{\mathbf{q}} + [\tilde{K}]\mathbf{q} = \tilde{\mathbf{F}}(t) \tag{14.54}$$

where

$$[\tilde{M}] = [\Phi]^T[M][\Phi] \tag{14.55}$$

$$[\tilde{C}] = [\Phi]^T[C][\Phi] \tag{14.56}$$

$$[\tilde{K}] = [\Phi]^T[K][\Phi] \tag{14.57}$$

$$\tilde{\mathbf{F}}(t) = [\Phi]^T\mathbf{F}(t) \tag{14.58}$$

Note that if the undamped natural modes of vibration are used as basis functions and if
$[C]$ is assumed to be a linear combination of $[M]$ and $[K]$ (called proportional damping),
Eqs. (14.54) represent a set of $r$ uncoupled second-order differential equations
which can be solved independently [14.10]. Once $\mathbf{q}(t)$ is found, the displacement solution
$\mathbf{Y}(t)$ can be determined from Eq. (14.53).

In the formulation of optimization problems with restrictions on the dynamic
response, the constraints are placed on selected displacement components as

$$|y_j(\mathbf{X}, t)| \le y_{\max}, \quad j = 1, 2, \ldots \tag{14.59}$$

where $y_j$ is the displacement at location $j$ on the machine/structure and $y_{\max}$ is the
maximum permissible value of the displacement. Constraints on dynamic stresses are
also stated in a similar manner. Since Eq. (14.59) is a parametric constraint in terms
of the parameter time ($t$), it is enforced only at a set of peak or critical values of $y_j$
for computational simplicity. Once Eq. (14.59) is satisfied at the critical points, it will
most likely be satisfied at all other values of $t$ as well [14.11, 14.12]. The values
of $y_j$ at which $dy_j/dt = 0$ or the values of $y_j$ at the end of the time interval denote
local maxima and hence are to be considered as candidate critical points. Among the
several candidate critical points, only a select number are considered for simplifying
the computations. For example, in the response shown in Fig. 14.5, peaks $a, b, c, \ldots, j$
qualify as candidate critical points. However, peaks $a$, $b$, $f$, and $j$ can be discarded
as their magnitudes are considerably smaller (less than, for example, 25%) than those
of other peaks. Noting that peaks $d$ and $e$ (or $g$ and $h$) represent essentially a single
large peak with high-frequency undulations, we can discard peak $e$ (or $g$), which has
a slightly smaller magnitude than $d$ (or $h$). Thus finally, only peaks $c$, $d$, $h$, and $i$ need
to be considered to satisfy the constraint, Eq. (14.59).

Once the critical points are identified at a reference design $\mathbf{X}$, the sensitivity of the
response, $y_j(\mathbf{X}, t)$, with respect to the design variable $x_i$ at the critical point $t = t_c$ can
be found using the total derivative of $y_j$ as

$$\frac{dy_j(\mathbf{X}, t)}{dx_i} = \frac{\partial y_j}{\partial x_i} + \frac{\partial y_j}{\partial t}\frac{dt_c}{dx_i} \tag{14.60}$$

The second term on the right-hand side of Eq. (14.60) is always zero since $\partial y_j/\partial t = 0$
at an interior peak ($0 < t_c < t_{\max}$) and $dt_c/dx_i = 0$ at the boundary ($t_c = t_{\max}$). The
derivative, $\partial y_j/\partial x_i$, can be computed using Eq. (14.53) as

$$\frac{\partial y_j}{\partial x_i} = \sum_{k=1}^{r} \Phi_{jk}\frac{\partial q_k(t)}{\partial x_i}, \quad i = 1, 2, \ldots, n \tag{14.61}$$

Figure 14.5 Critical points in a typical transient response.



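where, for simplicity, the elements of the matrix $[\Phi]$ are assumed to be constants
(independent of the design vector $\mathbf{X}$). Note that for higher accuracy, the derivatives
of $\Phi_{jk}$ with respect to $x_i$ (sensitivity of eigenvectors, if the mode shapes are used as
the basis vectors) obtained from an equation similar to Eq. (14.51) can be included in
finding $\partial y_j/\partial x_i$.

To find the values of $\partial q_k/\partial x_i$ required in Eq. (14.61), Eq. (14.54) is differentiated
with respect to $x_i$ to obtain

$$[\tilde{M}]\frac{\partial \ddot{\mathbf{q}}}{\partial x_i} + [\tilde{C}]\frac{\partial \dot{\mathbf{q}}}{\partial x_i} + [\tilde{K}]\frac{\partial \mathbf{q}}{\partial x_i} = \frac{\partial \tilde{\mathbf{F}}}{\partial x_i} - \frac{\partial [\tilde{M}]}{\partial x_i}\ddot{\mathbf{q}} - \frac{\partial [\tilde{C}]}{\partial x_i}\dot{\mathbf{q}} - \frac{\partial [\tilde{K}]}{\partial x_i}\mathbf{q}, \quad i = 1, 2, \ldots, n \tag{14.62}$$

The derivatives of the matrices appearing on the right-hand side of Eq. (14.62) can be
computed using formulas such as

$$\frac{\partial [\tilde{M}]}{\partial x_i} = [\Phi]^T\frac{\partial [M]}{\partial x_i}[\Phi] \tag{14.63}$$

where, for simplicity, $[\Phi]$ is assumed to be constant and $\partial [M]/\partial x_i$ is computed using
a finite-difference scheme. In most cases the forcing function $\mathbf{F}(t)$ will be known to
be independent of $\mathbf{X}$ or an explicit function of $\mathbf{X}$. Hence the quantity $\partial \tilde{\mathbf{F}}/\partial x_i$ can be
evaluated without much difficulty. Once the right-hand side is known, Eqs. (14.62) can
be integrated numerically in time to find the values of $\partial \ddot{\mathbf{q}}/\partial x_i$, $\partial \dot{\mathbf{q}}/\partial x_i$, and $\partial \mathbf{q}/\partial x_i$.
Using the values of $\partial \mathbf{q}/\partial x_i = \{\partial q_k/\partial x_i\}$ at the critical point $t_c$, the required sensitivity
of transient response can be found from Eq. (14.61).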


14.7 SENSITIVITY OF OPTIMUM SOLUTION TO PROBLEM 
PARAMETERS 

Any optimum design problem involves a design vector and a set of problem param- 
eters (or preassigned parameters). In many cases, we would be interested in knowing 
the sensitivities or derivatives of the optimum design (design variables and objective 
function) with respect to the problem parameters [14.25, 14.26]. As an example, con- 
sider the minimum weight design of a machine component or structure subject to a 
constraint on the induced stress. After solving the problem, we may like to find the 
effect of changing the material. This means that we would like to know the changes 
in the optimal dimensions and the minimum weight of the component or structure due 
to a change in the value of the permissible stress. Usually, the sensitivity derivatives 
are found by using a finite-difference method. But this requires a costly reoptimization 
of the problem using incremented values of the parameters. Hence, it is desirable to 
derive expressions for the sensitivity derivatives from appropriate equations. In this 
section we discuss two approaches: one based on the Kuhn-Tucker conditions and the 
other based on the concept of feasible direction. 
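14.7.1 Sensitivity Equations Using Kuhn-Tucker Conditions

The Kuhn-Tucker conditions satisfied at the constrained optimum design $\mathbf{X}^*$ are given
by [see Eqs. (2.73) and (2.74)]

$$\frac{\partial f(\mathbf{X})}{\partial x_i} + \sum_{j \in J_1} \lambda_j\frac{\partial g_j(\mathbf{X})}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \tag{14.64}$$

$$g_j(\mathbf{X}) = 0, \quad j \in J_1 \tag{14.65}$$

$$\lambda_j > 0, \quad j \in J_1 \tag{14.66}$$

where $J_1$ is the set of active constraints and Eqs. (14.64) to (14.66) are valid with
$\mathbf{X} = \mathbf{X}^*$ and $\lambda_j = \lambda_j^*$. When a problem parameter changes by a small amount, we
assume that Eqs. (14.64) to (14.66) remain valid. Treating $f$, $g_j$, $\mathbf{X}$, and $\lambda_j$ as functions
of a typical problem parameter $p$, differentiation of Eqs. (14.64) and (14.65) with
respect to $p$ leads to

$$\sum_{k=1}^{n}\left[\frac{\partial^2 f(\mathbf{X})}{\partial x_i\,\partial x_k} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j(\mathbf{X})}{\partial x_i\,\partial x_k}\right]\frac{\partial x_k}{\partial p} + \sum_{j \in J_1}\frac{\partial \lambda_j}{\partial p}\frac{\partial g_j(\mathbf{X})}{\partial x_i} + \frac{\partial^2 f(\mathbf{X})}{\partial x_i\,\partial p} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j(\mathbf{X})}{\partial x_i\,\partial p} = 0, \quad i = 1, 2, \ldots, n \tag{14.67}$$

$$\frac{\partial g_j(\mathbf{X})}{\partial p} + \sum_{i=1}^{n}\frac{\partial g_j(\mathbf{X})}{\partial x_i}\frac{\partial x_i}{\partial p} = 0, \quad j \in J_1 \tag{14.68}$$
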

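Equations (14.67) and (14.68) can be expressed in matrix form as

$$\begin{bmatrix} [P]_{n \times n} & [Q]_{n \times q} \\ [Q]^T_{q \times n} & [0]_{q \times q} \end{bmatrix}\begin{Bmatrix} \dfrac{\partial \mathbf{X}}{\partial p}_{\,n \times 1} \\ \dfrac{\partial \boldsymbol{\lambda}}{\partial p}_{\,q \times 1} \end{Bmatrix} + \begin{Bmatrix} \mathbf{a}_{n \times 1} \\ \mathbf{b}_{q \times 1} \end{Bmatrix} = \begin{Bmatrix} \mathbf{0}_{n \times 1} \\ \mathbf{0}_{q \times 1} \end{Bmatrix} \tag{14.69}$$

where $q$ denotes the number of active constraints and the elements of the matrices and
vectors in Eq. (14.69) are given by

$$P_{ik} = \frac{\partial^2 f(\mathbf{X})}{\partial x_i\,\partial x_k} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j(\mathbf{X})}{\partial x_i\,\partial x_k} \tag{14.70}$$

$$Q_{ij} = \frac{\partial g_j(\mathbf{X})}{\partial x_i}, \quad j \in J_1 \tag{14.71}$$

$$a_i = \frac{\partial^2 f(\mathbf{X})}{\partial x_i\,\partial p} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j(\mathbf{X})}{\partial x_i\,\partial p} \tag{14.72}$$

$$b_j = \frac{\partial g_j(\mathbf{X})}{\partial p}, \quad j \in J_1 \tag{14.73}$$

$$\frac{\partial \mathbf{X}}{\partial p} = \left\{\frac{\partial x_1}{\partial p}\;\; \frac{\partial x_2}{\partial p}\;\; \cdots\;\; \frac{\partial x_n}{\partial p}\right\}^T, \quad \frac{\partial \boldsymbol{\lambda}}{\partial p} = \left\{\frac{\partial \lambda_1}{\partial p}\;\; \frac{\partial \lambda_2}{\partial p}\;\; \cdots\;\; \frac{\partial \lambda_q}{\partial p}\right\}^T \tag{14.74}$$
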

The following can be noted in Eqs. (14.69):

1. Equations (14.69) denote $(n + q)$ simultaneous equations in terms of the
required sensitivity derivatives, $\partial x_i/\partial p$ ($i = 1, 2, \ldots, n$) and $\partial \lambda_j/\partial p$ ($j =
1, 2, \ldots, q$). Both $\mathbf{X}^*$ and $\boldsymbol{\lambda}^*$ are assumed to be known in Eqs. (14.69). If $\boldsymbol{\lambda}^*$
are not computed during the optimization process, they can be computed using
Eq. (7.263).

2. Equations (14.69) can be solved only if the system is nonsingular. One of the
requirements for this is that the active constraints be independent.

3. Second derivatives of $f$ and $g_j$ are required in computing the elements of $[P]$
and $\mathbf{a}$.

4. If sensitivity derivatives are required with respect to several problem parameters
$p_1, p_2, \ldots$, only the vectors $\mathbf{a}$ and $\mathbf{b}$ need to be computed for each case and the
system of Eqs. (14.69) can be solved efficiently using the techniques of solving
simultaneous equations with different right-hand-side vectors.


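Once Eqs. (14.69) are solved, the sensitivity of optimum objective value with respect
to $p$ can be computed as

$$\frac{df(\mathbf{X})}{dp} = \frac{\partial f(\mathbf{X})}{\partial p} + \sum_{i=1}^{n}\frac{\partial f(\mathbf{X})}{\partial x_i}\frac{\partial x_i}{\partial p} \tag{14.75}$$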
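
To make Eqs. (14.69) and (14.75) concrete, consider a small problem chosen only for illustration (it does not appear in the text): minimize $f = x_1^2 + x_2^2$ subject to $g = p - x_1 - x_2 \le 0$. The constraint is active at the optimum, where $x_1^* = x_2^* = p/2$ and $\lambda^* = p$, so the exact sensitivities are $\partial x_i/\partial p = 1/2$, $\partial \lambda/\partial p = 1$, and $df/dp = p$. The Python sketch below assembles and solves Eq. (14.69) for given $[P]$, $[Q]$, $\mathbf{a}$, and $\mathbf{b}$; the function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def kkt_sensitivity(P, Q, a, b):
    """Solve Eq. (14.69) for dX/dp (n values) and d(lambda)/dp (q values)."""
    n, q = len(P), len(b)
    A = [P[i] + Q[i] for i in range(n)]                        # rows [P  Q]
    A += [[Q[i][j] for i in range(n)] + [0.0] * q              # rows [Q^T 0]
          for j in range(q)]
    rhs = [-v for v in a] + [-v for v in b]                    # move {a; b} to the right
    sol = solve(A, rhs)
    return sol[:n], sol[n:]


def objective_sensitivity(df_dp_explicit, grad_f, dx_dp):
    """Eq. (14.75): total derivative of the optimum objective with respect to p."""
    return df_dp_explicit + sum(g * d for g, d in zip(grad_f, dx_dp))
```

For the illustrative problem with $p = 2$: $[P] = 2[I]$ because $g$ is linear, $[Q] = \{-1\; {-1}\}^T$, $\mathbf{a} = \mathbf{0}$, and $\mathbf{b} = \{1\}$. Calling `kkt_sensitivity([[2.0, 0.0], [0.0, 2.0]], [[-1.0], [-1.0]], [0.0, 0.0], [1.0])` then reproduces the exact sensitivities quoted above, and `objective_sensitivity(0.0, [2.0, 2.0], dx)` gives $df/dp = p = 2$. For several parameters only `a` and `b` change, as noted in item 4.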


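The changes in the optimum values of $x_i$ and $f$ necessary to satisfy the Kuhn-Tucker
conditions due to a change $\Delta p$ in the problem parameter can be estimated as

$$\Delta x_i = \frac{\partial x_i}{\partial p}\Delta p, \quad \Delta f = \frac{df}{dp}\Delta p \tag{14.76}$$

The changes in the values of Lagrange multiplier $\lambda_j$ due to $\Delta p$ can be estimated as

$$\Delta \lambda_j = \frac{\partial \lambda_j}{\partial p}\Delta p \tag{14.77}$$

Equation (14.77) can be used to determine whether an originally active constraint
becomes inactive due to the change, $\Delta p$. Since the value of $\lambda_j$ is zero for an inactive
constraint, we have

$$\lambda_j + \Delta \lambda_j = \lambda_j + \frac{\partial \lambda_j}{\partial p}\Delta p = 0 \tag{14.78}$$

from which the value of $\Delta p$ necessary to make the $j$th constraint inactive can be
found as

$$\Delta p = -\frac{\lambda_j}{\partial \lambda_j/\partial p}, \quad j \in J_1 \tag{14.79}$$

Similarly, a currently inactive constraint will become critical due to $\Delta p$ if the new
value of $g_j$ becomes zero:

$$g_j(\mathbf{X}) + \frac{dg_j}{dp}\Delta p = g_j(\mathbf{X}) + \left(\sum_{i=1}^{n}\frac{\partial g_j}{\partial x_i}\frac{\partial x_i}{\partial p}\right)\Delta p = 0 \tag{14.80}$$

Thus the change $\Delta p$ necessary to make an inactive constraint active can be
found as

$$\Delta p = -\frac{g_j(\mathbf{X})}{\displaystyle\sum_{i=1}^{n}\frac{\partial g_j}{\partial x_i}\frac{\partial x_i}{\partial p}} \tag{14.81}$$
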


14.7.2 Sensitivity Equations Using the Concept of Feasible Direction 

Here we treat the problem parameter $p$ as a design variable so that the new design
vector becomes

$$\mathbf{X} = \{x_1\; x_2\; \cdots\; x_n\; p\}^T \tag{14.82}$$

As in the case of the method of feasible directions (see Section 7.7), we formulate the
direction finding problem as

Find $\mathbf{S}$ which minimizes $-\mathbf{S}^T\nabla f(\mathbf{X})$

subject to

$$\mathbf{S}^T\nabla g_j \le 0, \quad j \in J_1$$

$$\mathbf{S}^T\mathbf{S} \le 1 \tag{14.83}$$

where the gradients of $f$ and $g_j$ ($j \in J_1$) can be evaluated in the usual manner. The set
$J_1$ can include nearly active constraints also (along with the active constraints) so that
we do not violate any constraint due to the change, $\Delta p$. The solution of the problem
stated in Eqs. (14.83) gives a usable feasible search direction, $\mathbf{S}$. A new design vector
along $\mathbf{S}$ can be expressed as

$$\mathbf{X}_{\text{new}} = \mathbf{X}_{\text{current}} + \lambda\mathbf{S} = \mathbf{X}_{\text{current}} + \Delta\mathbf{X} \tag{14.84}$$

where $\lambda$ is the step length and the components of $\mathbf{S}$ can be considered as

$$s_i = \begin{cases} \dfrac{\partial x_i}{\partial \lambda}, & i = 1, 2, \ldots, n \\[2ex] \dfrac{\partial p}{\partial \lambda}, & i = n + 1 \end{cases} \tag{14.85}$$

so that

$$\Delta p = \lambda s_{n+1} \tag{14.86}$$

If the vector $\mathbf{S}$ is normalized by dividing its components by $s_{n+1}$, Eq. (14.86) gives
$\lambda = \Delta p$ and hence Eq. (14.85) gives the desired sensitivity derivatives as

$$\left\{\frac{\partial x_1}{\partial p}\;\; \cdots\;\; \frac{\partial x_n}{\partial p}\;\; 1\right\}^T = \frac{1}{s_{n+1}}\mathbf{S} \tag{14.87}$$

Thus the sensitivity of the objective function with respect to $p$ can be computed as

$$\frac{df(\mathbf{X})}{dp} = \nabla f(\mathbf{X})^T\frac{\mathbf{S}}{s_{n+1}} \tag{14.88}$$

Note that unlike the previous method, this method does not require the values of $\lambda^*$
and the second derivatives of $f$ and $g_j$ to find the sensitivity derivatives. Also, if
sensitivities with respect to several problem parameters $p_1, p_2, \ldots$ are required, all we
need to do is to add them to the design vector $\mathbf{X}$ in Eq. (14.82).


14.8 MULTILEVEL OPTIMIZATION 
14.8.1 Basic Idea 

The design of practical systems involving a large number of elements or subsystems
with multiple-load conditions involves an excessive number of design variables and
constraints. The optimization problem becomes unmanageably large, and the solution
process becomes too costly and can easily saturate even the largest computers available.
In such cases the optimization problem can be broken into a series of smaller
problems using different strategies. Multilevel optimization is a decomposition
technique in which the problem is reformulated as several smaller subproblems (one
for each subsystem) and a coordination problem (at system level) to preserve the coupling
among the subproblems (subsystems). Such approaches have been used in linear
and dynamic programming also. In linear programming, the decomposition method (see 
Section 4.4) involves a number of independent linear subproblems coupled by limita- 
tions on the shared resources. When an individual subsystem is solved, the cost of the 
shared resources is added to its objective function. By a proper variation of the costs 
of the shared resources, the proposed optimal strategies of the various subproblems 
are sent to the master program, which, in turn, is optimized so that the overall cost is 
minimized. In dynamic programming, the problem is treated in stages with an optimal 
policy determined in each stage (see Chapter 9). This approach is particularly useful 
when the problem has a serial structure. 

For nonlinear design optimization problems, several decomposition methods 
have been proposed [14.14-14.16]. In the following section we consider a two-level 
approach in which the system is decomposed into a number of smaller subproblems, 
each with its own goals and constraints. The individual subsystem optimization 
problems are solved independently in the first level and the coordinated problem 
is solved in the second level. The approach is known as the model-coordination 
method. 
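14.8.2 Method

Let the optimization problem be stated as follows:

$$\text{Find } \mathbf{X} = \{x_1\; x_2\; \cdots\; x_n\}^T \text{ which minimizes } f(\mathbf{X}) \tag{14.89}$$

subject to

$$g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{14.90}$$

$$h_k(\mathbf{X}) = 0, \quad k = 1, 2, \ldots, p \tag{14.91}$$

$$x_i^{(l)} \le x_i \le x_i^{(u)}, \quad i = 1, 2, \ldots, n \tag{14.92}$$

where $x_i^{(l)}$ and $x_i^{(u)}$ denote the lower and upper bounds on $x_i$. Most systems permit the
partitioning of the vector $\mathbf{X}$ into two subvectors $\mathbf{Y}$ and $\mathbf{Z}$:

$$\mathbf{X} = \begin{Bmatrix} \mathbf{Y} \\ \mathbf{Z} \end{Bmatrix} \tag{14.93}$$

where the subvector $\mathbf{Y}$ denotes the coordination or interaction variables between the
subsystems and the subvector $\mathbf{Z}$ indicates the variables confined to subsystems. The
vector $\mathbf{Z}$, in turn, can be partitioned as

$$\mathbf{Z} = \begin{Bmatrix} \mathbf{Z}_1 \\ \vdots \\ \mathbf{Z}_k \\ \vdots \\ \mathbf{Z}_K \end{Bmatrix} \tag{14.94}$$

where $\mathbf{Z}_k$ represents the variables associated with the $k$th subsystem only and $K$ denotes
the number of subsystems. The partitioning of variables, Eq. (14.94), permits us to
regroup the constraints as

$$\begin{Bmatrix} g_1(\mathbf{X}) \\ g_2(\mathbf{X}) \\ \vdots \\ g_m(\mathbf{X}) \end{Bmatrix} = \begin{Bmatrix} \mathbf{g}^{(1)}(\mathbf{Y}, \mathbf{Z}_1) \\ \mathbf{g}^{(2)}(\mathbf{Y}, \mathbf{Z}_2) \\ \vdots \\ \mathbf{g}^{(K)}(\mathbf{Y}, \mathbf{Z}_K) \end{Bmatrix} \tag{14.95}$$

$$\begin{Bmatrix} h_1(\mathbf{X}) \\ h_2(\mathbf{X}) \\ \vdots \\ h_p(\mathbf{X}) \end{Bmatrix} = \begin{Bmatrix} \mathbf{h}^{(1)}(\mathbf{Y}, \mathbf{Z}_1) \\ \mathbf{h}^{(2)}(\mathbf{Y}, \mathbf{Z}_2) \\ \vdots \\ \mathbf{h}^{(K)}(\mathbf{Y}, \mathbf{Z}_K) \end{Bmatrix} \tag{14.96}$$

where the variables $\mathbf{Y}$ may appear in all the functions while the variables $\mathbf{Z}_k$ appear
only in the constraint sets $\mathbf{g}^{(k)} \le \mathbf{0}$ and $\mathbf{h}^{(k)} = \mathbf{0}$. The bounds on the variables,
Eq. (14.92), can be expressed as

$$\mathbf{Y}^{(l)} \le \mathbf{Y} \le \mathbf{Y}^{(u)}$$

$$\mathbf{Z}_k^{(l)} \le \mathbf{Z}_k \le \mathbf{Z}_k^{(u)}, \quad k = 1, 2, \ldots, K \tag{14.97}$$

Similarly, the objective function $f(\mathbf{X})$ can be expressed as

$$f(\mathbf{X}) = \sum_{k=1}^{K} f^{(k)}(\mathbf{Y}, \mathbf{Z}_k) \tag{14.98}$$

where $f^{(k)}(\mathbf{Y}, \mathbf{Z}_k)$ denotes the contribution of the $k$th subsystem to the overall objective
function. Using Eqs. (14.95) to (14.98), the two-level approach can be stated as follows.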

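First-level Problem. Tentatively fix the values of $\mathbf{Y}$ at $\mathbf{Y}^*$ so that the problem of
Eqs. (14.89) to (14.92) [or Eqs. (14.95) to (14.98)] can be restated (decomposed) as $K$
independent optimization problems as follows:

Find $\mathbf{Z}_k$ which minimizes $f^{(k)}(\mathbf{Y}, \mathbf{Z}_k)$

subject to

$$\mathbf{g}^{(k)}(\mathbf{Y}, \mathbf{Z}_k) \le \mathbf{0}$$

$$\mathbf{h}^{(k)}(\mathbf{Y}, \mathbf{Z}_k) = \mathbf{0} \tag{14.99}$$

$$\mathbf{Z}_k^{(l)} \le \mathbf{Z}_k \le \mathbf{Z}_k^{(u)}; \quad k = 1, 2, \ldots, K$$

It can be seen that the first-level problem seeks to find the minimum of the function

$$f(\mathbf{Y}, \mathbf{Z}) = \sum_{k=1}^{K} f^{(k)}(\mathbf{Y}, \mathbf{Z}_k) \tag{14.100}$$

for the (tentatively) fixed vector $\mathbf{Y}^*$.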


Second-level Problem. The following problem is solved in this stage:

Find a new $\mathbf{Y}^*$ which minimizes $f(\mathbf{Y}) = \displaystyle\sum_{k=1}^{K} f^{(k)}(\mathbf{Y}, \mathbf{Z}_k^*)$

subject to

$$\mathbf{Y}^{(l)} \le \mathbf{Y} \le \mathbf{Y}^{(u)} \tag{14.101}$$

where $\mathbf{Z}_k^*$, $k = 1, 2, \ldots, K$, are the optimal solutions of the first-level problems. An
additional constraint to ensure a finite value of $f(\mathbf{Y}^*)$ is also to be included while
solving the problem of Eqs. (14.101). Once the problem is solved and a new $\mathbf{Y}^*$ found,
we proceed to solve the first-level problems. This process is to be continued until
convergence is achieved. The iterative process can be summarized as follows:

1. Start with an initial coordination vector, $\mathbf{Y}^*$.

2. Solve the $K$ first-level optimization problems, stated in Eqs. (14.99), and find
the optimal vectors $\mathbf{Z}_k^*$ ($k = 1, 2, \ldots, K$).

3. Solve the second-level optimization problem stated in Eqs. (14.101) and find a
new vector $\mathbf{Y}^*$.

4. Check for the convergence of $f^*$ and $\mathbf{Y}^*$ (compared to the value $\mathbf{Y}^*$ used
earlier).

5. If the process has not converged, go to step 2 and repeat the process until
convergence.

The following example illustrates the procedure. 

Example 14.2 Find the minimum-weight design of the two-bar truss shown in Fig. 14.6
with constraints on the depth of the truss ($y = h$), cross-sectional areas of the members
($z_1 = A_1$) and ($z_2 = A_2$), and the stresses induced in the bars. Treat the depth of the
truss ($y$) and the cross-sectional areas of bars 1 and 2 ($z_1$ and $z_2$) as design variables.
The permissible stress in each bar is $\sigma_0 = 10^5$ Pa, unit weight is 76,500 N/m$^3$, $h$ is
constrained as 1 m $\le h \le$ 6 m, and the cross-sectional area of each bar is restricted to
lie between 0 and 0.1 m$^2$.

SOLUTION The stresses induced in the bars can be expressed as

$$\sigma_1 = \frac{P\sqrt{y^2 + 36}}{7yz_1}, \quad \sigma_2 = \frac{6P\sqrt{y^2 + 1}}{7yz_2}$$

and hence the optimization problem can be stated as follows:

Find $\mathbf{X} = \{y\; z_1\; z_2\}^T$ which minimizes

$$f(\mathbf{X}) = 76{,}500\,z_1\sqrt{y^2 + 36} + 76{,}500\,z_2\sqrt{y^2 + 1}$$

subject to

$$\frac{P\sqrt{y^2 + 36}}{7\sigma_0 yz_1} - 1 \le 0, \quad \frac{6P\sqrt{y^2 + 1}}{7\sigma_0 yz_2} - 1 \le 0$$

$$1 \le y \le 6, \quad 0 \le z_1 \le 0.1, \quad 0 \le z_2 \le 0.1$$

Figure 14.6 Two-bar truss.
We treat the bars 1 and 2 as subsystems 1 and 2, respectively, with $y$ as the coordination
variable ($\mathbf{Y} = \{y\}$) and $z_1$ and $z_2$ as the subsystem variables ($\mathbf{Z}_1 = \{z_1\}$ and $\mathbf{Z}_2 = \{z_2\}$).
By fixing the value of $y$ at $y^*$, we formulate the first-level problems as follows.

Subproblem 1.

Find $z_1$ which minimizes

$$f^{(1)}(y^*, z_1) = 76{,}500\,z_1\sqrt{(y^*)^2 + 36} \tag{E1}$$

subject to

$$g_1(y^*, z_1) = \frac{(1428.5714 \times 10^{-6})\sqrt{(y^*)^2 + 36}}{y^* z_1} - 1 \le 0 \tag{E2}$$

$$0 \le z_1 \le 0.1 \tag{E3}$$

Subproblem 2.

Find $z_2$ which minimizes

$$f^{(2)}(y^*, z_2) = 76{,}500\,z_2\sqrt{(y^*)^2 + 1} \tag{E4}$$

subject to

$$g_2(y^*, z_2) = \frac{(8571.4285 \times 10^{-6})\sqrt{(y^*)^2 + 1}}{y^* z_2} - 1 \le 0 \tag{E5}$$

$$0 \le z_2 \le 0.1 \tag{E6}$$


We can see that to minimize $f^{(1)}$ we need to make $z_1$ as small as possible without
violating the constraints of Eqs. (E2) and (E3). This gives the solution of subproblem
1, $z_1^*$ (which makes $g_1$ active) as

$$z_1^* = \frac{(1428.5714 \times 10^{-6})\sqrt{(y^*)^2 + 36}}{y^*} \tag{E7}$$

Similarly, the solution of subproblem 2, $z_2^*$ (which makes $g_2$ active) can be expressed
as

$$z_2^* = \frac{(8571.4285 \times 10^{-6})\sqrt{(y^*)^2 + 1}}{y^*} \tag{E8}$$

Now we state the second-level problem as follows:

Find $y$ which minimizes $f = f^{(1)}(y, z_1^*) + f^{(2)}(y, z_2^*)$

subject to

$$1 \le y \le 6 \tag{E9}$$

Using Eqs. (E7) and (E8), this problem can be restated as (using $y$ for $y^*$):

Find $y$ which minimizes

$$f = 76{,}500\,z_1^*\sqrt{y^2 + 36} + 76{,}500\,z_2^*\sqrt{y^2 + 1} = 109.2857\,\frac{y^2 + 36}{y} + 655.7143\,\frac{y^2 + 1}{y} \tag{E10}$$

subject to

$1 \le y \le 6$ and $f$ must be defined

The graph of $f$, given by Eq. (E10), is shown in Fig. 14.7 over the range $1 \le y \le 6$
from which the solution can be determined as $f^* = 3747.7$ N, $y^* = h^* = 2.45$ m, $z_1^* =
A_1^* = 3.7790 \times 10^{-3}$ m$^2$, and $z_2^* = A_2^* = 9.2579 \times 10^{-3}$ m$^2$.
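
The two-level solution of Example 14.2 can also be reproduced numerically. The Python sketch below is one possible arrangement, not the procedure used in the text: the first level evaluates the closed-form subproblem solutions of Eqs. (E7) and (E8) for a fixed $y$, and the second level minimizes Eq. (E10) over $1 \le y \le 6$ by a golden-section search (an assumed choice; the text solves this level graphically). The sketch further assumes that the area bounds $0 \le z_k \le 0.1$ remain inactive over this range, which holds for the given constants.

```python
import math

C1 = 1428.5714e-6   # coefficient in Eq. (E2)
C2 = 8571.4285e-6   # coefficient in Eq. (E5)
GAMMA = 76500.0     # unit weight, N/m^3


def first_level(y):
    """Subproblem solutions for fixed y: areas that make g1 and g2 active."""
    z1 = C1 * math.sqrt(y * y + 36.0) / y   # Eq. (E7)
    z2 = C2 * math.sqrt(y * y + 1.0) / y    # Eq. (E8)
    return z1, z2


def system_weight(y):
    """Second-level objective, Eq. (E10), using the first-level optima."""
    z1, z2 = first_level(y)
    return GAMMA * (z1 * math.sqrt(y * y + 36.0) + z2 * math.sqrt(y * y + 1.0))


def second_level(lo=1.0, hi=6.0, tol=1e-8):
    """Golden-section search on the coordination variable y."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if system_weight(c) < system_weight(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)
```

Because the subproblems have closed-form solutions here, no outer iteration between the levels is needed; in a general problem `first_level` would itself call an optimizer and the two levels would alternate as in steps 1 to 5 above. Equation (E10) simplifies to $f = 765y + 4590/y$, whose minimum is at $y = \sqrt{6} \approx 2.45$, in agreement with the graphical solution.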

14.9 PARALLEL PROCESSING 

Large-scale optimization problems can be solved efficiently using parallel computers. 
Parallel computers are simply multiple processing units combined in an organized 
fashion such that multiple independent computations for the same problem could be 
performed simultaneously or concurrently, thereby increasing the overall computational 
speed. Optimization problems involving extensive analysis, such as a finite-element 
analysis, can be solved on parallel computers using the following schemes: 

1. A multilevel (decomposition) approach with the subproblems solved in parallel 

2. A substructures approach with substructure analyses performed in parallel 

3. By implementing the optimization computations in parallel 



Figure 14.7 Graphical solution of the second-level problem. 
If a multilevel (decomposition) approach is used, the optimization of various subsystems
(at different levels) can be performed on parallel processors while the solution of the
coordinating optimization problem can be accomplished on the main processor. If the
optimization problem involves an extensive analysis, such as a finite-element analysis,
the problem can be decomposed into subsystems (substructures) and the analyses of
subsystems can be conducted on parallel processors with a main processor performing
the system-level computations. Such an approach was used by El-Sayed and Hsiung
[14.17, 14.20]. The procedure can be summarized as follows:

1. Initialize the optimization process. The current (related) design variables are 
sent to the various processors. 

2. The finite-element analyses of the substructures are performed on different 
(associated) processors. 

3. The main processor collects the stiffness and force contribution matrices from 
the various processors, solves for the displacements at the shared (common) 
boundary nodes of substructures, and sends the data to various processors. 

4. The associated processors perform the detailed calculations to find the displace- 
ments and stresses needed for the evaluation of the constraints. 

5. The main processor collects the constraint-related data from the associated
processors and checks the convergence of the optimization process. If convergence
is not achieved, it performs the computations of the optimization algorithm and
the procedure is repeated from step 1 onward.

Numerical examples were solved on a Cray X-MP four-processor supercomputer
[14.17]. For a 200-member planar truss, the weight was minimized with constraints on
stresses using four substructures. It was reported [14.17] that the parallel computations
required 10.585 s of CPU time, while the sequential computations required a CPU time
of 13.518 s (a speedup factor of 1.28).

For most mechanical and structural problems, parallel computers with MIMD (multiple
instruction multiple data) architecture are better suited. Atiqullah and Rao [14.21]
presented a procedure for the parallel implementation of the simulated annealing
algorithm. In this method, each processor is assigned certain design variables and performs
the variable-specific optimization. This information is later combined to complete one cycle
of optimization. Since the entire (variable-specific) optimization process is repeated on
each processor, all processors will be equally busy most of the time, except for any
input/output done by the specific processors. Thus the “divide and conquer” strategy
of optimization needs a “communicate and combine” process, which should be kept to
a minimum. The detailed procedure is shown as a flow diagram in Fig. 14.8.

The minimum- weight design of a 128-bar planar truss was considered with 
stress and buckling constraints. A speedup factor of 10.2569 was achieved using the 
eight-node configuration of an iPSC/860 computer. 


Figure 14.8 Flow diagram of parallel simulated annealing on a single node. $S^{(i)}$, set of design
variables assigned to node i; node i = processor i.


14.10 MULTIOBJECTIVE OPTIMIZATION

A multiobjective optimization problem with inequality constraints can be stated as
(equality constraints, if they exist, can also be included in the formulation of the
problem)

$$ \text{Find } \mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T \tag{14.102} $$

which minimizes

$$ f_1(\mathbf{X}),\ f_2(\mathbf{X}),\ \ldots,\ f_k(\mathbf{X}) \tag{14.103} $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{14.104} $$

where k denotes the number of objective functions to be minimized. Any or all of the
functions $f_i(\mathbf{X})$ and $g_j(\mathbf{X})$ may be nonlinear. The multiobjective optimization problem
is also known as a vector minimization problem. 
Figure 14.9 Pareto optimal solutions.


In general, no solution vector X exists that minimizes all the k objective functions 
simultaneously. Hence, a new concept, known as the Pareto optimum solution, is used 
in multiobjective optimization problems. A feasible solution $\mathbf{X}$ is called Pareto optimal
if there exists no other feasible solution $\mathbf{Y}$ such that $f_i(\mathbf{Y}) \le f_i(\mathbf{X})$ for $i = 1, 2, \ldots, k$
with $f_j(\mathbf{Y}) < f_j(\mathbf{X})$ for at least one j. In other words, a feasible vector $\mathbf{X}$ is called
Pareto optimal if there is no other feasible solution $\mathbf{Y}$ that would reduce some objective
function without causing a simultaneous increase in at least one other objective function.
For example, if the objective functions are given by $f_1 = (x - 3)^4$ and $f_2 = (x - 6)^2$,
their graphs are shown in Fig. 14.9. For this problem, all the values of x between 3
and 6 (points on the line segment PQ) denote Pareto optimal solutions.
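
The Pareto definition can be checked numerically for this one-variable example. The sketch below samples x on a grid (the grid range 0 to 9 and the spacing 0.1 are arbitrary choices made for illustration) and keeps only the points that no other sampled point dominates.

```python
def f1(x):
    return (x - 3.0) ** 4


def f2(x):
    return (x - 6.0) ** 2


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and better


def pareto_points(xs):
    """Return the sampled points that are not dominated by any other sampled point."""
    values = [(f1(x), f2(x)) for x in xs]
    return [x for x, v in zip(xs, values)
            if not any(dominates(w, v) for w in values)]


xs = [i / 10 for i in range(0, 91)]   # x = 0.0, 0.1, ..., 9.0
front = pareto_points(xs)
print(min(front), max(front))         # end points of the nondominated range
```

On this grid the nondominated points run from x = 3 to x = 6, in agreement with the segment PQ: between the two individual minima, lowering one objective always raises the other.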

Several methods have been developed for solving a multiobjective optimization 
problem. Some of these methods are briefly described in the following paragraphs. 
Most of these methods basically generate a set of Pareto optimal solutions and use 
some additional criterion or rule to select one particular Pareto optimal solution as the 
solution of the multiobjective optimization problem. 


14.10.1 Utility Function Method

In the utility function method, a utility function $U_i(f_i)$ is defined for each objective
depending on the importance of $f_i$ compared to the other objective functions. Then a
total or overall utility function U is defined, for example, as

$$ U = \sum_{i=1}^{k} U_i(f_i) \tag{14.105} $$

The solution vector $\mathbf{X}^*$ is then found by maximizing the total utility U subject to the
constraints $g_j(\mathbf{X}) \le 0$, $j = 1, 2, \ldots, m$. A simple form of Eq. (14.105) is given by

$$ U = \sum_{i=1}^{k} U_i = -\sum_{i=1}^{k} w_i f_i(\mathbf{X}) \tag{14.106} $$

where $w_i$ is a scalar weighting factor associated with the ith objective function. This
method [Eq. (14.106)] is also known as the weighting function method.
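
As an illustration of the weighting function method, the sketch below applies it to the two functions used in the Pareto example, $f_1 = (x - 3)^4$ and $f_2 = (x - 6)^2$. Maximizing $U = -\sum w_i f_i$ is the same as minimizing $\sum w_i f_i$. The weights, the search interval, and the golden section routine are choices made here for illustration; they are not taken from the text.

```python
import math


def golden_section(func, a, b, tol=1e-8):
    """Minimize a unimodal function of one variable on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while (b - a) > tol:
        if func(c) < func(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


def weighted_sum(x, w1, w2):
    """Negative of the total utility U in Eq. (14.106)."""
    return w1 * (x - 3.0) ** 4 + w2 * (x - 6.0) ** 2


x_equal = golden_section(lambda x: weighted_sum(x, 0.5, 0.5), 0.0, 9.0)
x_f1_heavy = golden_section(lambda x: weighted_sum(x, 0.9, 0.1), 0.0, 9.0)
print(x_equal, x_f1_heavy)
```

With equal weights the stationarity condition $2(x-3)^3 + (x-6) = 0$ gives x = 4, a Pareto optimal point. Increasing the weight on $f_1$ moves the solution toward x = 3, the minimum of $f_1$.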


14.10.2 Inverted Utility Function Method 

In the inverted utility function method, we invert each utility and try to minimize or
reduce the total undesirability. Thus if $U_i(f_i)$ denotes the utility function corresponding
to the ith objective function, the total undesirability is obtained as

$$ U^{-1} = \sum_{i=1}^{k} U_i^{-1} = \sum_{i=1}^{k} \frac{1}{U_i} \tag{14.107} $$

The solution of the problem is found by minimizing $U^{-1}$ subject to the constraints
$g_j(\mathbf{X}) \le 0$, $j = 1, 2, \ldots, m$.


14.10.3 Global Criterion Method

In the global criterion method the optimum solution $\mathbf{X}^*$ is found by minimizing a
preselected global criterion, $F(\mathbf{X})$, such as the sum of the squares of the relative
deviations of the individual objective functions from the feasible ideal solutions. Thus
$\mathbf{X}^*$ is found by minimizing

$$ F(\mathbf{X}) = \sum_{i=1}^{k} \left\{ \frac{f_i(\mathbf{X}_i^*) - f_i(\mathbf{X})}{f_i(\mathbf{X}_i^*)} \right\}^p \tag{14.108} $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m $$

where p is a constant (a usual value of p is 2) and $\mathbf{X}_i^*$ is the ideal solution for the
ith objective function. The solution $\mathbf{X}_i^*$ is obtained by minimizing $f_i(\mathbf{X})$ subject to the
constraints $g_j(\mathbf{X}) \le 0$, $j = 1, 2, \ldots, m$.


14.10.4 Bounded Objective Function Method 

In the bounded objective function method, the minimum and the maximum acceptable
achievement levels for each objective function $f_i$ are specified as $L^{(i)}$ and $U^{(i)}$,
respectively, for $i = 1, 2, \ldots, k$. Then the optimum solution $\mathbf{X}^*$ is found by minimizing the
most important objective function, say, the rth one, as follows:

$$ \text{Minimize } f_r(\mathbf{X}) $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m $$
$$ L^{(i)} \le f_i(\mathbf{X}) \le U^{(i)}, \quad i = 1, 2, \ldots, k, \ i \ne r \tag{14.109} $$
14.10.5 Lexicographic Method

In the lexicographic method, the objectives are ranked in order of importance by the
designer. The optimum solution $\mathbf{X}^*$ is then found by minimizing the objective functions
starting with the most important and proceeding according to the order of importance
of the objectives. Let the subscripts of the objectives indicate not only the objective
function number, but also the priorities of the objectives. Thus $f_1(\mathbf{X})$ and $f_k(\mathbf{X})$ denote
the most and least important objective functions, respectively. The first problem is
formulated as

$$ \text{Minimize } f_1(\mathbf{X}) $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{14.110} $$

and its solution $\mathbf{X}_1^*$ and $f_1^* = f_1(\mathbf{X}_1^*)$ is obtained. Then the second problem is
formulated as

$$ \text{Minimize } f_2(\mathbf{X}) $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m $$
$$ f_1(\mathbf{X}) = f_1^* \tag{14.111} $$

The solution of this problem is obtained as $\mathbf{X}_2^*$ and $f_2^* = f_2(\mathbf{X}_2^*)$. This procedure
is repeated until all the k objectives have been considered. The ith problem is
given by

$$ \text{Minimize } f_i(\mathbf{X}) $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m $$
$$ f_l(\mathbf{X}) = f_l^*, \quad l = 1, 2, \ldots, i - 1 \tag{14.112} $$

and its solution is found as $\mathbf{X}_i^*$ and $f_i^* = f_i(\mathbf{X}_i^*)$. Finally, the solution obtained at
the end (i.e., $\mathbf{X}_k^*$) is taken as the desired solution $\mathbf{X}^*$ of the original multiobjective
optimization problem.
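
The sequence of problems above can be sketched on a finite set of candidate designs, where each equality constraint $f_l(\mathbf{X}) = f_l^*$ becomes a filter that keeps only the candidates attaining the current optimum. The candidate grid, the two test functions, and the tolerance are hypothetical choices made for illustration; the method in the text solves a nonlinear programming problem at each stage rather than searching a list.

```python
def lexicographic(candidates, objectives, tol=1e-12):
    """Filter candidates objective by objective, most important first."""
    remaining = list(candidates)
    for f in objectives:
        best = min(f(x) for x in remaining)
        # enforce f(x) = f* (within tol) before moving to the next objective
        remaining = [x for x in remaining if f(x) - best <= tol]
    return remaining


def f1(x):
    """Most important objective; it is zero everywhere on 2 <= x <= 4."""
    return max(abs(x - 3.0) - 1.0, 0.0) ** 2


def f2(x):
    """Second objective, used to break the tie left by f1."""
    return (x - 6.0) ** 2


grid = [i / 100 for i in range(0, 801)]   # x = 0.00, 0.01, ..., 8.00
solution = lexicographic(grid, [f1, f2])
print(solution)
```

Here $f_1$ alone cannot distinguish the designs in 2 ≤ x ≤ 4, so the second stage selects x = 4, the member of that set closest to the minimum of $f_2$. If the first objective has a unique minimizer, the lower-priority objectives have no influence on the result.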

14.10.6 Goal Programming Method 

In the simplest version of goal programming, the designer sets goals for each objective 
that he or she wishes to attain. The optimum solution X * is then defined as the one that 
minimizes the deviations from the set goals. Thus the goal programming formulation 
of the multiobjective optimization problem leads to

$$ \text{Minimize } \left[ \sum_{j=1}^{k} \left( d_j^+ + d_j^- \right)^p \right]^{1/p}, \quad p \ge 1 $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m $$
$$ f_j(\mathbf{X}) + d_j^+ - d_j^- = b_j, \quad j = 1, 2, \ldots, k $$
$$ d_j^+ \ge 0, \quad j = 1, 2, \ldots, k \tag{14.113} $$
$$ d_j^- \ge 0, \quad j = 1, 2, \ldots, k $$
$$ d_j^+ d_j^- = 0, \quad j = 1, 2, \ldots, k $$

where $b_j$ is the goal set by the designer for the jth objective and $d_j^+$ and $d_j^-$ are,
respectively, the underachievement and overachievement of the jth goal. The value of
p is based on the utility function chosen by the designer. Often the goal for the jth
objective, $b_j$, is found by first solving the following problem:

$$ \text{Minimize } f_j(\mathbf{X}) $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{14.114} $$

If the solution of the problem stated in Eq. (14.114) is denoted by $\mathbf{X}_j^*$, then $b_j$ is taken
as $b_j = f_j(\mathbf{X}_j^*)$.

14.10.7 Goal Attainment Method 

In the goal attainment method, goals are set as $b_i$ for the objective function $f_i(\mathbf{X})$,
$i = 1, 2, \ldots, k$. In addition, a weight $w_i > 0$ is defined for the objective function $f_i(\mathbf{X})$ to
denote the importance of the ith objective function relative to other objective functions
in meeting the goal $b_i$, $i = 1, 2, \ldots, k$. Often the goal $b_i$ is found by first solving the
single-objective optimization problem:

$$ \text{Minimize } f_i(\mathbf{X}) $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m \tag{14.115} $$

If the solution of the problem stated in Eq. (14.115) is denoted $\mathbf{X}_i^*$, then $b_i$ can be
taken as the optimum value of the objective $f_i$, $f_i^* = f_i(\mathbf{X}_i^*)$. A scalar $\gamma$ is introduced
as a design variable in addition to the n design variables $x_i$, $i = 1, 2, \ldots, n$. Then the
following problem is solved:

$$ \text{Find } x_1, x_2, \ldots, x_n \text{ and } \gamma $$
$$ \text{to minimize } F(x_1, x_2, \ldots, x_n, \gamma) = \gamma $$

subject to

$$ g_j(\mathbf{X}) \le 0, \quad j = 1, 2, \ldots, m $$
$$ f_i(\mathbf{X}) - \gamma w_i \le b_i, \quad i = 1, 2, \ldots, k \tag{14.116} $$

with the weights satisfying the normalization condition

$$ \sum_{i=1}^{k} w_i = 1 $$
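
For a fixed design, the smallest scalar that satisfies all the goal constraints of Eq. (14.116) is the largest of the ratios $(f_i - b_i)/w_i$, so the goal attainment problem can be viewed as minimizing that maximum. The sketch below applies this to a one-variable illustration with $f_1 = (x - 3)^4$ and $f_2 = (x - 6)^2$; the goals, weights, search interval, and golden section routine are assumptions made for this example.

```python
import math


def attainment_factor(x, goals, weights):
    """Smallest gamma such that f_i(x) - gamma * w_i <= b_i for every objective."""
    f = [(x - 3.0) ** 4, (x - 6.0) ** 2]
    return max((fi - bi) / wi for fi, bi, wi in zip(f, goals, weights))


def golden_section(func, a, b, tol=1e-9):
    """Minimize a unimodal function of one variable on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while (b - a) > tol:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


goals = [0.0, 0.0]       # b_1, b_2: the individual minima of f_1 and f_2
weights = [0.5, 0.5]     # w_1 + w_2 = 1
x_star = golden_section(lambda x: attainment_factor(x, goals, weights), 0.0, 9.0)
gamma_star = attainment_factor(x_star, goals, weights)
print(x_star, gamma_star)
```

With equal weights and these goals, both goal constraints are active at the solution, which therefore satisfies $(x-3)^4 = (x-6)^2$ on the interval between 3 and 6, that is, $x = 3 + (\sqrt{13} - 1)/2$.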

14.11 SOLUTION OF MULTIOBJECTIVE PROBLEMS USING 
MATLAB 

The MATLAB function fgoalattain can be used to solve a multiobjective optimiza- 
tion problem using the goal attainment method. The following example illustrates the 
procedure. 

Example 14.3 Find the solution of the following three-objective optimization problem
using the goal attainment method with the MATLAB function fgoalattain.

Minimize

$$ f_1 = \tfrac{1}{2}(x_1 - 2)^2 + \tfrac{1}{13}(x_2 + 1)^2 + 3 $$
$$ f_2 = \tfrac{1}{175}(x_1 + x_2 - 3)^2 + \tfrac{1}{17}(2x_2 - x_1)^2 - 13 $$
$$ f_3 = \tfrac{1}{8}(3x_1 - 2x_2 + 4)^2 + \tfrac{1}{27}(x_1 - x_2 + 1)^2 + 15 $$

subject to

$$ -4 \le x_i \le 4, \quad i = 1, 2 $$
$$ 4x_1 + x_2 - 4 \le 0 $$
$$ -x_1 - 1 \le 0 $$
$$ x_1 - x_2 - 2 \le 0 $$

Assume the initial design variables to be $x_1 = x_2 = 0.1$, the weights to be $w_1 = 0.2$,
$w_2 = 0.5$, and $w_3 = 0.3$, and the goals to be $b_1 = 5$, $b_2 = -8$, and $b_3 = 20$.

SOLUTION 

Step 1: Create an m-file for the objective functions and save it as fgoalattain_obj.m

function f = fgoalattain_obj(x)
% Values of the three objective functions at the design x
f(1) = (x(1)-2)^2/2 + (x(2)+1)^2/13 + 3;
f(2) = (x(1)+x(2)-3)^2/175 + (2*x(2)-x(1))^2/17 - 13;
f(3) = (3*x(1)-2*x(2)+4)^2/8 + (x(1)-x(2)+1)^2/27 + 15;

Step 2: Create an m-file for the constraints and save it as fgoalattain_con.m

function [c, ceq] = fgoalattain_con(x)
% Inequality constraints written in the form c(x) <= 0
c = [-4 - x(1); ...
     x(1) - 4; ...
     -4 - x(2); ...
     x(2) - 4; ...
     x(2) + 4*x(1) - 4; ...
     -1 - x(1); ...
     x(1) - 2 - x(2)];
ceq = [];   % no equality constraints

Step 3: Create an m-file for the main program and save it as fgoalattain_main.m

clc; clear all;
x0 = [0.1 0.1]
weight = [0.2 0.5 0.3]
goal = [5 -8 20]
% The six empty arguments are A, b, Aeq, beq, lb, and ub
[x, fval, attainfactor, exitflag] = fgoalattain(@fgoalattain_obj, ...
    x0, goal, weight, [], [], [], [], [], [], @fgoalattain_con)

Step 4: Run the program fgoalattain_main.m to obtain the following result: 


Initial design vector: 0.1, 0.1
Initial objective values: 4.8981 -12.9546 17.1383
Constraints at initial design: -4.1000 -3.9000 -4.1000 -3.9000 -3.5000 -1.1000 -2.0000

Optimum design vector: 0.8308 0.6769
Optimum objective values: 3.8999 -12.9712 18.3498
Constraints at optimum design: -4.8308 -3.1692 -4.6769 -3.3231 -0.0000 -1.8308 -1.8462
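
The reported numbers can be cross-checked without MATLAB. The sketch below re-evaluates the objective and constraint expressions of the m-file listings at the initial and optimum designs, and computes the ratios $(f_i - b_i)/w_i$ from the goal constraints of the goal attainment method. It does not call or emulate fgoalattain; the function names are chosen here for illustration.

```python
def objectives(x1, x2):
    """Objective functions as read from the objective m-file listing."""
    f1 = (x1 - 2) ** 2 / 2 + (x2 + 1) ** 2 / 13 + 3
    f2 = (x1 + x2 - 3) ** 2 / 175 + (2 * x2 - x1) ** 2 / 17 - 13
    f3 = (3 * x1 - 2 * x2 + 4) ** 2 / 8 + (x1 - x2 + 1) ** 2 / 27 + 15
    return [f1, f2, f3]


def constraints(x1, x2):
    """Inequality constraints c <= 0, in the order used in the constraint m-file."""
    return [-4 - x1, x1 - 4, -4 - x2, x2 - 4,
            x2 + 4 * x1 - 4, -1 - x1, x1 - 2 - x2]


goal = [5.0, -8.0, 20.0]
weight = [0.2, 0.5, 0.3]

f_init = objectives(0.1, 0.1)
f_opt = objectives(0.8308, 0.6769)
c_opt = constraints(0.8308, 0.6769)

# (f_i - b_i) / w_i for each objective; the largest ratio is the attainment factor
ratios = [(fi - bi) / wi for fi, bi, wi in zip(f_opt, goal, weight)]

print([round(v, 4) for v in f_init])
print([round(v, 4) for v in f_opt])
print([round(v, 4) for v in c_opt])
print([round(v, 2) for v in ratios])
```

At the reported optimum the fifth constraint is essentially active, and the ratios for the first and third objectives are nearly equal and larger than the ratio for the second. This is the pattern expected at a goal attainment solution: the scalar being minimized is set by the goal constraints that are active.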
REVIEW QUESTIONS

14.1 What is a reduced basis technique? 

14.2 State two methods of reducing the size of an optimization problem. 

14.3 What is design variable linking? Can it always be used? 

14.4 Under what condition(s) is the convergence of the quantity $\sum_i \Delta\mathbf{Y}_i$ in the fast reanalysis
method ensured? 

14.5 How do you compute the derivatives of the stiffness matrix with respect to a design
variable, $\partial[K]/\partial x_i$?

14.6 What is a MIMD computer? 

14.7 Indicate various ways by which parallel computations can be performed in a large-scale 
optimization problem. 

14.8 How are the goals determined in the goal programming method? 

14.9 Answer true or false: 

(a) The computation of the derivatives of a particular $\lambda_i$ requires other eigenvalues
besides $\lambda_i$.

(b) The derivatives of the ith eigenvector can be found without knowledge of the
eigenvectors other than $\mathbf{Y}_i$.

(c) There is only one way to derive expressions for the sensitivity of optimal objective 
function with respect to problem parameters. 

(d) Multilevel optimization is the same as decomposition.

(e) In multilevel optimization, the suboptimization problems are to be solved iteratively.

(f) All multiobjective optimization methods find only a Pareto optimum solution.

(g) All multiobjective optimization techniques convert the problem into a single-objective
problem.

(h) A vector optimization problem is the same as a multiobjective optimization problem.

(i) Only one Pareto optimal solution exists for a multiobjective optimization problem. 

(j) The weighting function method can be considered as the utility function method. 

(k) It is possible to achieve the optimum value of each objective function simultaneously 
in a multiobjective optimization problem. 

14.10 Define the following terms: 

(a) Pareto optimal point 

(b) Utility function method 

(c) Weighting function method 

(d) Global criterion function method 
(e) Bounded objective function method

(f) Lexicographic method 


PROBLEMS 

14.1 Consider the minimum-volume design of the four-bar truss shown in Fig. 14.2 subject 
to a constraint on the vertical displacement of node 4. Let $\mathbf{X}_1 = \{1, 1, 0.5, 0.5\}^T$ and
$\mathbf{X}_2 = \{0.5, 0.5, 1, 1\}^T$ be two design vectors, with $x_i$ denoting the area of cross section
of bar i (i = 1, 2, 3, 4). By expressing the optimum design vector as $\mathbf{X} = c_1\mathbf{X}_1 + c_2\mathbf{X}_2$,
determine the values of $c_1$ and $c_2$ through graphical optimization when the maximum
permissible vertical deflection of node 4 is restricted to a magnitude of 0.1 in. 

14.2 Consider the configuration (shape) optimization of the 10-bar truss shown in Fig. 14.10. 
The (X, Y) coordinates of the nodes are to be varied while maintaining (a) symmetry 
of the structure about the X axis, and (b) alignment of nodes 1, 2, and 3 (4, 5, and 6). 
Identify the independent and dependent design variables and derive the relevant design 
variable linking relationships. 

14.3 For the four-bar truss considered in Example 14.1 (shown in Fig. 14.2), a base design 
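vector is given by $\mathbf{X}_0 = \{A_1, A_2, A_3, A_4\}^T = \{2.0, 1.0, 2.0, 1.0\}^T$ in$^2$. If $\Delta\mathbf{X}$ is given by
$\Delta\mathbf{X} = \{0.4, 0.4, -0.4, -0.4\}^T$ in$^2$, determine

(a) The exact displacement vector $\mathbf{Y}_0 = \{y_5, y_6, y_7, y_8\}^T$ at $\mathbf{X}_0$

(b) The exact displacement vector $(\mathbf{Y}_0 + \Delta\mathbf{Y})$ at $(\mathbf{X}_0 + \Delta\mathbf{X})$

(c) The displacement vector $(\mathbf{Y}_0 + \Delta\mathbf{Y})$ where $\Delta\mathbf{Y}$ is given by Eq. (14.20) with five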
terms 

14.4 Consider the 11-member truss shown in Fig. 5.1 with loads Q = −1000 lb, R = 1000 lb,
and S = 2000 lb. If $A_i = x_i$ denotes the area of cross section of member i, and
$u_1, u_2, \ldots, u_{10}$ indicate the displacement components of the nodes, the equilibrium
equations can be expressed as shown in Eqs. ($E_1$) to ($E_{10}$) of Example 5.1.
Assuming that $E = 30 \times 10^6$ psi, l = 50 in., $x_i = 1$ in$^2$ (i = 1, 2, . . . , 11), $\Delta x_i = 0.1$
in$^2$ (i = 1, 2, . . . , 5), and $\Delta x_i = -0.1$ in$^2$ (i = 6, 7, . . . , 11), determine

(a) Exact displacement solution $\mathbf{U}_0$ at $\mathbf{X}_0$

(b) Exact displacement solution $(\mathbf{U}_0 + \Delta\mathbf{U})$ at the perturbed design, $(\mathbf{X}_0 + \Delta\mathbf{X})$

(c) Approximate displacement solution, $(\mathbf{U}_0 + \Delta\mathbf{U})$, at $(\mathbf{X}_0 + \Delta\mathbf{X})$ using Eq. (14.20)
with four terms for $\Delta\mathbf{U}$

Figure 14.10 Design variable linking of a 10-bar truss.

14.5 Consider the four-bar truss shown in Fig. 14.2 whose stiffness matrix is given by 
Eq. ($E_1$) of Example 14.1. Determine the values of the derivatives of $y_i$ with respect
to the area $A_1$, $\partial y_i/\partial x_1$ (i = 5, 6, 7, 8), at the reference design $\mathbf{X}_0 = \{A_1\ A_2\ A_3\ A_4\}^T =
\{2.0, 2.0, 1.0, 1.0\}^T$ in$^2$.

14.6 Find the values of $\partial y_i/\partial x_2$ (i = 5, 6, 7, 8) in Problem 14.5.

14.7 Find the values of $\partial y_i/\partial x_3$ (i = 5, 6, 7, 8) in Problem 14.5.

14.8 Find the values of $\partial y_i/\partial x_4$ (i = 5, 6, 7, 8) in Problem 14.5.

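14.9 The equilibrium equations of the stepped bar shown in Fig. 14.11 are given by

$$ [K]\mathbf{Y} = \mathbf{P} \tag{1} $$

with

$$ [K] = \begin{bmatrix} \dfrac{A_1E_1}{l_1} + \dfrac{A_2E_2}{l_2} & -\dfrac{A_2E_2}{l_2} \\[2ex] -\dfrac{A_2E_2}{l_2} & \dfrac{A_2E_2}{l_2} \end{bmatrix} \tag{2} $$

$$ \mathbf{Y} = \begin{Bmatrix} Y_1 \\ Y_2 \end{Bmatrix}, \quad \mathbf{P} = \begin{Bmatrix} P_1 \\ P_2 \end{Bmatrix} \tag{3} $$

If $A_1 = 2$ in.$^2$, $A_2 = 1$ in.$^2$, $E_1 = E_2 = 30 \times 10^6$ psi, $2l_1 = l_2 = 50$ in., $P_1 = 100$ lb,
and $P_2 = 200$ lb, determine

(a) Displacements, $\mathbf{Y}$

(b) Values of $\partial\mathbf{Y}/\partial A_1$ and $\partial\mathbf{Y}/\partial A_2$ using the method of Section 14.4

(c) Values of $\partial\boldsymbol{\sigma}/\partial A_1$ and $\partial\boldsymbol{\sigma}/\partial A_2$, where $\boldsymbol{\sigma} = \{\sigma_1, \sigma_2\}^T$ denotes the vector of stresses
in the bars and $\sigma_1 = E_1Y_1/l_1$ and $\sigma_2 = E_2(Y_2 - Y_1)/l_2$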



Figure 14.11 Stepped bar. 
14.10 The eigenvalue problem for the stepped bar shown in Fig. 14.11 can be expressed as
$[K]\mathbf{Y} = \lambda[M]\mathbf{Y}$ with the mass matrix, [M], given by

$$ [M] = \begin{bmatrix} (2\rho_1A_1l_1 + \rho_2A_2l_2) & \rho_2A_2l_2 \\ \rho_2A_2l_2 & \rho_2A_2l_2 \end{bmatrix} $$

where $\rho_i$, $A_i$, and $l_i$ denote the mass density, area of cross section, and length of the
segment i, and the stiffness matrix, [K], is given by Eq. (2) of Problem 14.9. If $A_1 = 2$ in$^2$,
$A_2 = 1$ in$^2$, $E_1 = E_2 = 30 \times 10^6$ psi, $2l_1 = l_2 = 50$ in., and $\rho_1 g = \rho_2 g = 0.283$ lb/in$^3$,
determine

(a) Eigenvalues $\lambda_i$ and the eigenvectors $\mathbf{Y}_i$, i = 1, 2

(b) Values of $\partial\lambda_i/\partial A_1$, i = 1, 2, using the method of Section 14.5

(c) Values of $\partial\mathbf{Y}_i/\partial A_1$, i = 1, 2, using the method of Section 14.5


14.11 For the stepped bar considered in Problem 14.10, determine the following using the 
method of Section 14.5. 

(a) Values of $\partial\lambda_i/\partial A_2$, i = 1, 2

(b) Values of $\partial\mathbf{Y}_i/\partial A_2$, i = 1, 2

14.12 A cantilever beam with a hollow circular section with outside diameter d and wall 
thickness t (Fig. 14.12) is modeled with one beam finite element. The resulting static
equilibrium equations can be expressed as

$$ \frac{2EI}{l^3} \begin{bmatrix} 6 & -3l \\ -3l & 2l^2 \end{bmatrix} \begin{Bmatrix} Y_1 \\ Y_2 \end{Bmatrix} = \begin{Bmatrix} P_1 \\ P_2 \end{Bmatrix} $$

where I is the area moment of inertia of the cross section, E is Young’s modulus, and
l the length. Determine the displacements, $Y_i$, and the sensitivities of the deflections,
$\partial Y_i/\partial d$ and $\partial Y_i/\partial t$ (i = 1, 2), for the following data: $E = 30 \times 10^6$ psi, l = 20 in., d = 2
in., t = 0.1 in., $P_1 = 100$ lb, and $P_2 = 0$.

14.13 The eigenvalues of the cantilever beam shown in Fig. 14.12 are governed by the equation

$$ \frac{2EI}{l^3} \begin{bmatrix} 6 & -3l \\ -3l & 2l^2 \end{bmatrix} \begin{Bmatrix} Y_1 \\ Y_2 \end{Bmatrix} = \frac{\lambda\rho Al}{420} \begin{bmatrix} 156 & -22l \\ -22l & 4l^2 \end{bmatrix} \begin{Bmatrix} Y_1 \\ Y_2 \end{Bmatrix} $$

where E is Young’s modulus, I the area moment of inertia, l the length, $\rho$ the mass
density, A the cross-sectional area, $\lambda$ the eigenvalue, and $\mathbf{Y} = \{Y_1, Y_2\}^T$ = eigenvector. If
$E = 30 \times 10^6$ psi, d = 2 in., t = 0.1 in., l = 20 in., and $\rho g = 0.283$ lb/in$^3$, determine

(a) Eigenvalues $\lambda_i$ and eigenvectors $\mathbf{Y}_i$ (i = 1, 2)

(b) Values of $\partial\lambda_i/\partial d$ and $\partial\lambda_i/\partial t$ (i = 1, 2)

Figure 14.12 Hollow circular cantilever beam.

Figure 14.13 Two-degree-of-freedom spring-mass system.

14.14 In Problem 14.13, determine the derivatives of the eigenvectors $\partial\mathbf{Y}_i/\partial d$ and $\partial\mathbf{Y}_i/\partial t$
(i = 1, 2).

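14.15 The natural frequencies of the spring-mass system shown in Fig. 14.13 are given by
(for $k_i = k$, i = 1, 2, 3 and $m_i = m$, i = 1, 2)

$$ \lambda_1 = \frac{k}{m} = \omega_1^2, \quad \lambda_2 = \frac{3k}{m} = \omega_2^2 $$

$$ \mathbf{Y}_1 = c_1 \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}, \quad \mathbf{Y}_2 = c_2 \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} $$

where $\omega_1$ and $\omega_2$ are the natural frequencies of vibration of the system and $c_1$ and $c_2$
are constants. The stiffness of each helical spring is given by

$$ k = \frac{d^4 G}{8 D^3 n} $$

where d is the wire diameter, D the coil diameter, G the shear modulus, and n the
number of turns of the spring. Determine the values of $\partial\omega_i/\partial D$ and $\partial\mathbf{Y}_i/\partial D$ for the
following data: d = 0.04 in., $G = 11.5 \times 10^6$ psi, D = 0.4 in., n = 10, and m = 32.2 lb-s$^2$/in.
The stiffness and mass matrices of the system are given by

$$ [K] = k \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \quad [M] = m \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$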

14.16 Find the minimum volume design of the truss shown in Fig. 14.14 with constraints on 
the depth of the truss (y), cross-sectional areas of the bars ($A_1$ and $A_2$), and the stresses
induced in the bars ($\sigma_1$ and $\sigma_2$). Treat y, $A_1$, and $A_2$ as design variables with $\sigma_i \le 10^5$ Pa
(i = 1, 2), 1 m ≤ y ≤ 4 m, and $0 \le A_i \le 0.2$ m$^2$ (i = 1, 2). Use the multilevel optimization
approach for the solution. 

14.17 Find the sensitivities of $x_1^*$, $x_2^*$, and $f^*$ with respect to Young’s modulus of the tubular
column considered in Example 1.1.

14.18 Consider the two-bar truss shown in Fig. 1.15. The problem of design of the truss for 
minimum weight subject to stress constraints can be stated as follows: 

Find x, $A_1$, and $A_2$ which minimize

$$ f = 28.30A_1\sqrt{1 + x^2} + 14.15A_2\sqrt{1 + x^2} $$

subject to

$$ g_1 = \frac{0.1768(1 + x)\sqrt{1 + x^2}}{A_1 x} - 1 \le 0 $$
$$ g_2 = \frac{0.1768(x - 1)\sqrt{1 + x^2}}{A_2 x} - 1 \le 0 $$
$$ 0.1 \le x \le 2.5, \quad 1.0 \le A_i \le 2.5 \ (i = 1, 2) $$

where the members are assumed to be made up of different materials. Solve this
optimization problem using the multilevel approach.

14.19 Consider the design of the two-bar truss shown in Fig. 14.15 with the location of nodes
1 and 2 (x) and the area of cross section of bars (A) as design variables. If the weight
and the displacement of node 3 are to be minimized with constraints on the stresses
induced in the bars along with bounds on the design variables, the problem can be stated
as follows [14.34]:


Find $\mathbf{X} = \{x_1\ x_2\}^T$ which minimizes

$$ f_1(\mathbf{X}) = 2\rho h x_2\sqrt{1 + x_1^2} $$
$$ f_2(\mathbf{X}) = \frac{Ph(1 + x_1^2)^{1.5}\sqrt{1 + x_1^4}}{2\sqrt{2}\,E x_1^2 x_2} $$

subject to

$$ g_1(\mathbf{X}) = \frac{P(1 + x_1)\sqrt{1 + x_1^2}}{2\sqrt{2}\,x_1 x_2} - \sigma_0 \le 0 $$
$$ g_2(\mathbf{X}) = \frac{P(x_1 - 1)\sqrt{1 + x_1^2}}{2\sqrt{2}\,x_1 x_2} - \sigma_0 \le 0 $$
$$ x_i \ge x_i^{(l)}, \quad i = 1, 2 $$

where $x_1 = x/h$, $x_2 = A/A_{\text{ref}}$, h the depth, E is Young’s modulus, $\rho$ the weight density,
$\sigma_0$ the permissible stress, and $x_i^{(l)}$ the lower bound on $x_i$. Find the optimum solutions of
the individual objective functions subject to the stated constraints using a graphical
procedure. Data: P = 10,000 lb, $\rho = 0.283$ lb/in$^3$, $E = 30 \times 10^6$ psi, h = 100 in., $A_{\text{ref}} = 1$
in.$^2$, $\sigma_0 = 20{,}000$ psi, $x_1^{(l)} = 0.1$, and $x_2^{(l)} = 1.0$.

14.20 Solve the two-objective optimization problem stated in Problem 14.19 using the weight- 
ing method with equal weights to the two objective functions. Use a graphical method 
of solution. 

14.21 Solve the two-objective optimization problem stated in Problem 14.19 using the global 
criterion method with p = 2. Use a graphical method of solution. 

14.22 Formulate the two-objective optimization problem stated in Problem 14.19 as a goal 
programming problem using the goals of 30 lb and 0.015 in. for the objectives $f_1$ and
$f_2$, respectively. Solve the problem using a graphical procedure.

14.23 Consider the following two-objective optimization problem: 

Find $\mathbf{X} = \{x_1\ x_2\ x_3\ x_4\ x_5\ x_6\}^T$
to minimize

$$ f_1(\mathbf{X}) = -25(x_1 - 2)^2 - (x_2 - 2)^2 - (x_3 - 1)^2 - (x_4 - 4)^2 - (x_5 - 1)^2 $$
$$ f_2(\mathbf{X}) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 + x_6^2 $$

subject to

$$ -x_1 - x_2 + 2 \le 0; \quad x_1 + x_2 - 6 \le 0; \quad -x_1 + x_2 - 2 \le 0; \quad x_1 - 3x_2 - 2 \le 0; $$
$$ (x_3 - 3)^2 + x_4 - 4 \le 0; \quad -(x_5 - 3)^2 - x_6 + 4 \le 0; \quad 0 \le x_i \le 10, \ i = 1, 2, 6 $$
$$ 1 \le x_i \le 5, \ i = 3, 5; \quad 0 \le x_4 \le 6 $$

Find the minima of the individual objective functions under the stated constraints using 
the MATLAB function fmincon. 

14.24 Find the solution of the two-objective optimization problem stated in Problem 14.23 
using the weighting function method with the weights $w_1 = w_2 = 1$. Use the MATLAB
function fmincon for the solution. 

14.25 Find the solution of the two-objective optimization problem stated in Problem 14.23 
using the global criterion method with p = 2. Use the MATLAB function fmincon for 
the solution. 

14.26 Find the solution of the two-objective optimization problem stated in Problem 14.23 
using the bounded objective function method. Take the lower and upper bounds on $f_2$
as 80 and 120% of the optimum value $f_2^*$ found in Problem 14.23. Use the MATLAB
function fmincon for the solution. 

14.27 Find the solution of the two-objective optimization problem stated in Problem 14.23 
using the goal attainment method. Use the MATLAB function fgoalattain for the 
solution. Use suitable goals for the objectives. 

14.28 Consider the following three-objective optimization problem: 

Find $\mathbf{X} = \{x_1\ x_2\}^T$ to minimize

$$ f_1(\mathbf{X}) = 1.5 - x_1(1 - x_2) $$
$$ f_2(\mathbf{X}) = 2.25 - x_1(1 - x_2^2) $$
$$ f_3(\mathbf{X}) = 2.625 - x_1(1 - x_2^3) $$

subject to

$$ -x_1^2 - (x_2 - 0.5)^2 + 9 \le 0 $$
$$ (x_1 - 1)^2 + (x_2 - 0.5)^2 - 6.25 \le 0 $$
$$ -10 \le x_i \le 10; \quad i = 1, 2 $$

Find the minima of the individual objectives under the stated constraints using the 
MATLAB function fmincon. 

14.29 Find the solution of the 3-objective problem stated in Problem 14.28 using the
weighting function method with the weights $w_1 = w_2 = w_3 = 1$. Use the MATLAB function
fmincon for the solution. 

14.30 Find the solution of the multiobjective problem stated in Problem 14.28 using the goal 
attainment method. Use the MATLAB function fgoalattain for the solution. Use 
suitable goals for the objectives. 



Convex and Concave Functions 

Convex Function. A function $f(\mathbf{X})$ is said to be convex if for any pair of points

$$ \mathbf{X}_1 = \begin{Bmatrix} x_1^{(1)} \\ x_2^{(1)} \\ \vdots \\ x_n^{(1)} \end{Bmatrix} \quad \text{and} \quad \mathbf{X}_2 = \begin{Bmatrix} x_1^{(2)} \\ x_2^{(2)} \\ \vdots \\ x_n^{(2)} \end{Bmatrix} $$

and all $\lambda$, $0 \le \lambda \le 1$,

$$ f[\lambda\mathbf{X}_2 + (1 - \lambda)\mathbf{X}_1] \le \lambda f(\mathbf{X}_2) + (1 - \lambda)f(\mathbf{X}_1) \tag{A.1} $$

that is, if the segment joining the two points lies entirely above or on the graph of
$f(\mathbf{X})$. Figures A.1a and A.2a illustrate a convex function in one and two dimensions,
respectively. It can be seen that a convex function is always bending upward and
hence it is apparent that the local minimum of a convex function is also a global
minimum.

Concave Function. A function $f(\mathbf{X})$ is called a concave function if for any two
points $\mathbf{X}_1$ and $\mathbf{X}_2$, and for all $0 \le \lambda \le 1$,

$$ f[\lambda\mathbf{X}_2 + (1 - \lambda)\mathbf{X}_1] \ge \lambda f(\mathbf{X}_2) + (1 - \lambda)f(\mathbf{X}_1) \tag{A.2} $$

that is, if the line segment joining the two points lies entirely below or on the graph
of $f(\mathbf{X})$.

Figures A.1b and A.2b give a concave function in one and two dimensions,
respectively. It can be seen that a concave function bends downward and hence the local
maximum will also be its global maximum. It can be seen that the negative of a
convex function is a concave function, and vice versa. Also note that the sum of convex
functions is a convex function and the sum of concave functions is a concave
function. A function $f(\mathbf{X})$ is strictly convex or concave if the strict inequality holds
in Eqs. (A.1) or (A.2) for any $\mathbf{X}_1 \ne \mathbf{X}_2$. A linear function will be both convex and
concave since it satisfies both inequalities (A.1) and (A.2). A function may be
convex within a region and concave elsewhere. An example of such a function is shown
in Fig. A.3.

Testing for Convexity or Concavity. In addition to the definition given, the following
equivalent relations can be used to identify a convex function.

Figure A.1 Functions of one variable: (a) convex function in one variable; (b) concave function
in one variable.

Figure A.2 Functions of two variables: (a) convex function in two variables; (b) concave
function in two variables.

Figure A.3 Function that is convex over certain region and concave over certain other region.
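Theorem A.1 A function $f(\mathbf{X})$ is convex if for any two points $\mathbf{X}_1$ and $\mathbf{X}_2$, we have

$$ f(\mathbf{X}_2) \ge f(\mathbf{X}_1) + \nabla f^T(\mathbf{X}_1)(\mathbf{X}_2 - \mathbf{X}_1) \tag{A.3} $$

Proof: If $f(\mathbf{X})$ is convex, we have by definition

$$ f[\lambda\mathbf{X}_2 + (1 - \lambda)\mathbf{X}_1] \le \lambda f(\mathbf{X}_2) + (1 - \lambda)f(\mathbf{X}_1) $$

that is,

$$ f[\mathbf{X}_1 + \lambda(\mathbf{X}_2 - \mathbf{X}_1)] \le f(\mathbf{X}_1) + \lambda[f(\mathbf{X}_2) - f(\mathbf{X}_1)] $$

This inequality can be rewritten as

$$ f(\mathbf{X}_2) - f(\mathbf{X}_1) \ge \frac{f[\mathbf{X}_1 + \lambda(\mathbf{X}_2 - \mathbf{X}_1)] - f(\mathbf{X}_1)}{\lambda(\mathbf{X}_2 - \mathbf{X}_1)}\,(\mathbf{X}_2 - \mathbf{X}_1) \tag{A.4} $$

By defining $\Delta\mathbf{X} = \lambda(\mathbf{X}_2 - \mathbf{X}_1)$, the inequality (A.4) can be written as

$$ f(\mathbf{X}_2) - f(\mathbf{X}_1) \ge \frac{f[\mathbf{X}_1 + \Delta\mathbf{X}] - f(\mathbf{X}_1)}{\Delta\mathbf{X}}\,(\mathbf{X}_2 - \mathbf{X}_1) \tag{A.5} $$

By taking the limit as $\Delta\mathbf{X} \to \mathbf{0}$, inequality (A.5) becomes

$$ f(\mathbf{X}_2) - f(\mathbf{X}_1) \ge \nabla f^T(\mathbf{X}_1)(\mathbf{X}_2 - \mathbf{X}_1) \tag{A.6} $$

which can be seen to be the desired result. If $f(\mathbf{X})$ is concave, the opposite type of
inequality holds true in (A.6).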


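Theorem A.2 A function $f(\mathbf{X})$ is convex if the Hessian matrix
$\mathbf{H}(\mathbf{X}) = [\partial^2 f(\mathbf{X})/\partial x_i\,\partial x_j]$ is positive semidefinite.

Proof: From Taylor’s theorem we have

$$ f(\mathbf{X}^* + \mathbf{h}) = f(\mathbf{X}^*) + \sum_{i=1}^{n} h_i \frac{\partial f}{\partial x_i}(\mathbf{X}^*) + \frac{1}{2!} \sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \left. \frac{\partial^2 f}{\partial x_i\,\partial x_j} \right|_{\mathbf{X} = \mathbf{X}^* + \theta\mathbf{h}} \tag{A.7} $$

where $0 < \theta < 1$. By letting $\mathbf{X}^* = \mathbf{X}_1$, $\mathbf{X}^* + \mathbf{h} = \mathbf{X}_2$, and $\mathbf{h} = \mathbf{X}_2 - \mathbf{X}_1$, Eq. (A.7) can
be rewritten as

$$ f(\mathbf{X}_2) = f(\mathbf{X}_1) + \nabla f^T(\mathbf{X}_1)(\mathbf{X}_2 - \mathbf{X}_1) + \tfrac{1}{2}(\mathbf{X}_2 - \mathbf{X}_1)^T\,\mathbf{H}\{\mathbf{X}_1 + \theta(\mathbf{X}_2 - \mathbf{X}_1)\}\,(\mathbf{X}_2 - \mathbf{X}_1) \tag{A.8} $$

It can be seen that inequality (A.6) is satisfied [and hence $f(\mathbf{X})$ will be convex] if $\mathbf{H}(\mathbf{X})$
is positive semidefinite. Further, if $\mathbf{H}(\mathbf{X})$ is positive definite, the function $f(\mathbf{X})$ will be
strictly convex. It can also be proved that if $f(\mathbf{X})$ is concave, the Hessian matrix is
negative semidefinite.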
The following theorem establishes a very important relation, namely, that any local
minimum is a global minimum for a convex function.

Theorem A.3 Any local minimum of a convex function $f(\mathbf{X})$ is a global minimum.

Proof: Let us prove this theorem by contradiction. Suppose that there exist two different
local minima, say, $\mathbf{X}_1$ and $\mathbf{X}_2$, for the function $f(\mathbf{X})$. Let $f(\mathbf{X}_2) < f(\mathbf{X}_1)$. Since $f(\mathbf{X})$
is convex, $\mathbf{X}_1$ and $\mathbf{X}_2$ have to satisfy the relation (A.6), that is,

$$ f(\mathbf{X}_2) - f(\mathbf{X}_1) \ge \nabla f^T(\mathbf{X}_1)(\mathbf{X}_2 - \mathbf{X}_1) \tag{A.6} $$

or, since the left-hand side is negative,

$$ \nabla f^T(\mathbf{X}_1)\mathbf{S} < 0 \tag{A.9} $$

where $\mathbf{S} = (\mathbf{X}_2 - \mathbf{X}_1)$ is a vector joining the points $\mathbf{X}_1$ to $\mathbf{X}_2$. Equation (A.9) indicates
that the value of the function $f(\mathbf{X})$ can be decreased further by moving in the direction
$\mathbf{S} = (\mathbf{X}_2 - \mathbf{X}_1)$ from point $\mathbf{X}_1$. This conclusion contradicts the original assumption that
$\mathbf{X}_1$ is a local minimum. Thus a convex function cannot have two local minima with
different function values; every local minimum is a global minimum.


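Example A.1 Determine whether the following functions are convex or concave.

(a) $f(x) = e^x$

(b) $f(x) = -8x^2$

(c) $f(x_1, x_2) = 2x_1^3 - 6x_2^2$

(d) $f(x_1, x_2, x_3) = 4x_1^2 + 3x_2^2 + 5x_3^2 + 6x_1x_2 + x_1x_3 - 3x_1 - 2x_2 + 15$

SOLUTION

(a) $f(x) = e^x$: $H(x) = d^2f/dx^2 = e^x > 0$ for all real values of x. Hence $f(x)$ is
strictly convex.

(b) $f(x) = -8x^2$: $H(x) = d^2f/dx^2 = -16 < 0$ for all real values of x. Hence
$f(x)$ is strictly concave.

(c) $f = 2x_1^3 - 6x_2^2$:

$$ \mathbf{H}(\mathbf{X}) = \begin{bmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_1\,\partial x_2 \\ \partial^2 f/\partial x_1\,\partial x_2 & \partial^2 f/\partial x_2^2 \end{bmatrix} = \begin{bmatrix} 12x_1 & 0 \\ 0 & -12 \end{bmatrix} $$

Here $\partial^2 f/\partial x_1^2 = 12x_1 \le 0$ for $x_1 \le 0$ and $\ge 0$ for $x_1 \ge 0$, and

$$ |\mathbf{H}(\mathbf{X})| = -144x_1 \ge 0 \text{ for } x_1 \le 0 \text{ and } \le 0 \text{ for } x_1 \ge 0 $$

Hence $\mathbf{H}(\mathbf{X})$ will be negative semidefinite and $f(\mathbf{X})$ is concave for $x_1 \le 0$.

(d) $f = 4x_1^2 + 3x_2^2 + 5x_3^2 + 6x_1x_2 + x_1x_3 - 3x_1 - 2x_2 + 15$:

$$ \mathbf{H}(\mathbf{X}) = \begin{bmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_1\,\partial x_2 & \partial^2 f/\partial x_1\,\partial x_3 \\ \partial^2 f/\partial x_1\,\partial x_2 & \partial^2 f/\partial x_2^2 & \partial^2 f/\partial x_2\,\partial x_3 \\ \partial^2 f/\partial x_1\,\partial x_3 & \partial^2 f/\partial x_2\,\partial x_3 & \partial^2 f/\partial x_3^2 \end{bmatrix} = \begin{bmatrix} 8 & 6 & 1 \\ 6 & 6 & 0 \\ 1 & 0 & 10 \end{bmatrix} $$

Here the leading principal minors are given by

$$ |8| = 8 > 0, \quad \begin{vmatrix} 8 & 6 \\ 6 & 6 \end{vmatrix} = 12 > 0, \quad \begin{vmatrix} 8 & 6 & 1 \\ 6 & 6 & 0 \\ 1 & 0 & 10 \end{vmatrix} = 114 > 0 $$

and hence the matrix $\mathbf{H}(\mathbf{X})$ is positive definite for all real values of $x_1$, $x_2$, and
$x_3$. Therefore, $f(\mathbf{X})$ is a strictly convex function.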
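
The determinant test used in part (d) is easy to automate for small matrices. The sketch below computes the leading principal minors of the Hessian of part (d) with a simple cofactor expansion; the helper names are chosen here for illustration, and the expansion is practical only for matrices of small order.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def leading_principal_minors(h):
    """Determinants of the upper-left 1x1, 2x2, ..., nxn submatrices of h."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]


# Hessian of f = 4x1^2 + 3x2^2 + 5x3^2 + 6x1x2 + x1x3 - 3x1 - 2x2 + 15
H = [[8, 6, 1],
     [6, 6, 0],
     [1, 0, 10]]

minors = leading_principal_minors(H)
positive_definite = all(m > 0 for m in minors)
print(minors, positive_definite)
```

For a symmetric matrix, all leading principal minors being positive is equivalent to positive definiteness (Sylvester's criterion), which is the fact the example relies on. The Hessian here is constant, so the conclusion holds at every point.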


Some Computational Aspects 
of Optimization 



Several methods were presented for solving different types of optimization problems 
in Chapters 3 to 14. This appendix is intended to give some guidance to the reader in 
choosing a suitable method for solving a particular problem along with some computa- 
tional details. Most of the discussion is aimed at the solution of nonlinear programming 
problems. 


B.1 CHOICE OF METHOD

Several factors are to be considered in deciding on a particular method to solve a given
optimization problem. Some of them are

1. The type of problem to be solved (general nonlinear programming problem,
geometric programming problem, etc.)

2. The availability of a ready-made computer program

3. The calendar time required for the development of a program

4. The necessity of derivatives of the functions f and $g_j$, $j = 1, 2, \ldots, m$

5. The available knowledge about the efficiency of the method

6. The accuracy of the solution desired

7. The programming language and quality of coding desired

8. The robustness and dependability of the method in finding the true optimum
solution 

9. The generality of the program for solving other problems 

10. The ease with which the program can be used and its output interpreted 

B.2 COMPARISON OF UNCONSTRAINED METHODS 

A number of studies have been made to evaluate the various unconstrained minimization 
methods. Moré, Garbow, and Hillstrom [B.1] provided a collection of 35 test functions
for testing the reliability and robustness of unconstrained minimization software. The 
performance of eight unconstrained minimization methods was evaluated by Box [B.2] 
using a set of test problems with up to 20 variables. Straeter and Hogge [B.3] compared 
four gradient-based unconstrained optimization techniques using two test problems. 
A comparison of several variable metric algorithms was made by Shanno and Phua
[B.4]. Sargent and Sebastian presented numerical experiences with unconstrained
imization algorithms [B.5]. On the basis of these studies, the following general con- 
clusions can be drawn. 

If the first and second derivatives of the objective function (f) can be evaluated 
easily (either in closed form or by a finite-difference scheme), and if the number of 
design variables is not large (n < 50), Newton's method can be used effectively. For 
n greater than about 50, the storage and inversion of the Hessian matrix at each stage 
become quite tedious and the variable metric methods might prove to be more useful. 
As the problem size increases (beyond n = 100 or so), the conjugate gradient method 
becomes more powerful. 

In many practical problems, the first derivatives of f can be computed more 
accurately than the second derivatives. In such cases, the BFGS and DFP methods become 
an obvious choice for minimization. Of these two, the BFGS method is more stable 
and efficient. If the evaluation of the derivatives of f is extremely difficult or if the 
function does not possess continuous derivatives, Powell's method can be used to solve 
the problem efficiently. 
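
The guidelines above can be condensed into a small decision helper. The Python sketch below is only an illustration: the function name and its arguments are hypothetical, and the thresholds (n < 50, n > 100) are the approximate values quoted in the text, not sharp limits.

```python
def suggest_unconstrained_method(n, first_derivs=True, second_derivs=False):
    """Map the rough guidelines of this section to a method name.

    n             -- number of design variables
    first_derivs  -- True if first derivatives of f can be evaluated
    second_derivs -- True if second derivatives of f are also easy to evaluate
    (Hypothetical helper; thresholds are the approximate ones from the text.)
    """
    if not first_derivs:
        # Derivatives unavailable or discontinuous: use a direct search method.
        return "Powell's method"
    if second_derivs and n < 50:
        # Small problems with an easily evaluated Hessian.
        return "Newton's method"
    if n > 100:
        # Large problems: storing an n x n matrix becomes the bottleneck.
        return "conjugate gradient method"
    # Otherwise a variable metric method; BFGS is preferred over DFP.
    return "BFGS method"


if __name__ == "__main__":
    for case in [(10, True, True), (10, True, False), (500, True, False), (10, False, False)]:
        print(case, "->", suggest_unconstrained_method(*case))
```

Such a helper only gives a first guess; as noted later in this appendix, it is usually necessary to try more than one method on a practical problem.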

With regard to the one-dimensional minimization required in all the unconstrained 
methods, the Newton and cubic interpolation methods are most efficient when the 
derivatives of f are available. Otherwise, the Fibonacci or the golden section method 
has to be used. 
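
As an illustration of the derivative-free case, the following Python sketch implements the golden section method for a function that is assumed to be unimodal on the interval [a, b]. The function name, the tolerance, and the test function are illustrative choices, not taken from the text.

```python
import math


def golden_section_minimize(f, a, b, tol=1e-6):
    """Locate the minimum of a unimodal function f on [a, b].

    Only function values are used; one new evaluation is needed per iteration.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # reciprocal of the golden ratio, about 0.618
    x1 = b - inv_phi * (b - a)  # left interior point
    x2 = a + inv_phi * (b - a)  # right interior point
    f1, f2 = f(x1), f(x2)
    while (b - a) > tol:
        if f1 < f2:
            # Minimum lies in [a, x2]; the old x1 becomes the new right point.
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, b]; the old x2 becomes the new left point.
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)


if __name__ == "__main__":
    x_star = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
    print(x_star)  # close to 2
```

The interval shrinks by the factor 0.618 per function evaluation, which is why the method is slower than interpolation methods when derivatives are available.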


B.3 COMPARISON OF CONSTRAINED METHODS 

The comparative evaluation of nonlinear programming techniques was conducted by 
several investigators. Colville [B.6] compared the efficiencies of 30 codes using eight 
test problems that involve 3 to 16 design variables and 0 to 14 constraints. However, 
the codes were tested at different sites on different computers and hence the study was 
not considered reliable. Eason and Fenton [B.7] conducted a comparative study of 20 
codes using 13 problems that also included the problems used by Colville. However, 
their study was confined primarily to penalty function-type methods. Sandgren and 
Ragsdell [B.8] studied the relative efficiencies of the leading nonlinear programming 
methods of the day more systematically. They studied 24 codes using 35 problems, 
including some of those used by Colville and Eason and Fenton. 

The number of design variables varied from 2 to 48 and the number of constraints 
ranged from 0 to 19; some problems involved equality constraints, too. They found 
the GRG method to be most robust and efficient followed by the exterior and interior 
penalty function methods. 

Schittkowski published the results of his study of nonlinear programming codes in 
1980 [B.9]. He experimented with 20 codes on 180 randomly generated test problems 
using multiple starting points. Based on his study, the sequential quadratic program- 
ming was found to be most efficient, followed by the GRG, method of multipliers, 
and penalty function methods, in that order. Similar comparative studies of geometric 
programming codes were also conducted [B.10-B.12]. Although the studies above 
were quite extensive, their conclusions may not be of much use in practice, since the 
studies covered relatively few methods and were limited to specially 
formulated test problems that are not related to real-life problems. Thus each new practical 
problem has to be tackled almost independently based on past experience. The 
following guidelines are applicable for a general problem. 

The sequential quadratic programming approach can be used for solving a variety of 
problems efficiently. The GRG method and Zoutendijk’s method of feasible directions, 
although slightly less efficient, can also be used for the efficient solution of constrained 
problems. The ALM and penalty function methods are less efficient but are robust and 
reliable in finding the solution of constrained problems. 


B.4 AVAILABILITY OF COMPUTER PROGRAMS 

Many computer programs are available to solve nonlinear programming problems. 
Notable among these is the book by Kuester and Mize [B.13], which gives Fortran 
programs for solving linear, quadratic, geometric, dynamic, and nonlinear programming 
problems. During practical computations, it is important to note that a method that 
works well for a given class of problems may work poorly for others. Hence it is 
usually necessary to try more than one method to solve a particular problem efficiently. 
Further, the efficiency of any nonlinear programming method depends largely on the 
values of adjustable parameters such as starting point, step length, and convergence 
requirements. Hence a proper set of values to these adjustable parameters can be given 
only by using a trial-and-error procedure or through experience gained in working with 
the method for similar problems. It is also desirable to run the program with different 
starting points to avoid local and false optima. It is advisable to test the two convergence 
criteria stated in Section 7.21 before accepting a point as a local minimum. 
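
The advice to restart from different points can be illustrated with a minimal multistart sketch in Python. Everything in it is an assumption made for illustration: the test function f(x) = x^4 - 3x^2 + x (which has one local and one global minimum), the fixed-step steepest descent used as the local search, and the step size and iteration count.

```python
def f(x):
    """Illustrative one-variable function with two minima."""
    return x ** 4 - 3.0 * x ** 2 + x


def df(x):
    """Analytical derivative of f."""
    return 4.0 * x ** 3 - 6.0 * x + 1.0


def local_descent(x0, step=0.01, iters=5000):
    """Fixed-step steepest descent; converges to the minimum nearest to x0."""
    x = x0
    for _ in range(iters):
        x -= step * df(x)
    return x


def multistart(starts):
    """Run the local search from every starting point and keep the best result."""
    results = [(f(x), x) for x in (local_descent(s) for s in starts)]
    return min(results)  # (best function value, corresponding x)


if __name__ == "__main__":
    for s in (-2.0, 0.0, 2.0):
        x = local_descent(s)
        print("start", s, "-> x =", round(x, 4), "f =", round(f(x), 4))
    print("best:", multistart([-2.0, 0.0, 2.0]))
```

A run started at x = 2 stops at the local minimum near x = 1.13, while runs started at x = -2 or x = 0 reach the lower minimum near x = -1.30; comparing the results exposes the local optimum.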

Moré and Wright present information on the current state of numerical optimization 
software in [B.16]. Several software systems such as IMSL, MATLAB, and ACM 
contain programs to solve optimization problems. The relevant addresses are 

IMSL 

7500 Bellaire Boulevard 

Houston, TX 77036 

MATLAB 

The MathWorks, Inc. 

24 Prime Park Way 

Natick, MA 01760 

ACM Distribution Service 

c/o International Mathematics and Statistics Service 

7500 Bellaire Boulevard 

Houston, TX 77036 

In addition, the commercial structural optimization packages listed in Table B.1 
are available in the market [B.14, B.15]. Most of these software packages are based on a 
finite-element-based analysis for objective and constraint function evaluations and use 
several types of approximation strategies. 
Table B.1 Summary of Some Structural Optimization Packages 

Software system               Source                            Capabilities and
(program)                     (developer)                       characteristics

ASTROS (Automated             Air Force Wright Laboratories     Structural optimization with
STRuctural Optimization       FIBRA                             static, eigenvalue, modal
System)                       Wright-Patterson Air Force        analysis, and flutter
                              Base, OH 45433-6553               constraints; approximation
                                                                concepts; compatibility with
                                                                NASTRAN; sensitivity analysis

ANSYS                         Swanson Analysis Systems, Inc.    Optimum design based on
                              P.O. Box 65                       curve-fitting technique to
                              Johnson Road                      approximate the response
                              Houston, PA 15342-0065            using several trial design
                                                                vectors

MSC/NASTRAN                   MacNeal-Schwendler Corporation    Structural optimization
(MacNeal Schwendler           15 Colorado Boulevard             capability based on static,
Corporation/NAsa              Los Angeles, CA 90041             natural frequency, and
STRuctural ANalysis)                                            buckling analysis;
                                                                approximation concepts
                                                                and sensitivity analysis

NISAOPT                       Engineering Mechanics             Minimum-weight design
                              Research Corporation              subject to displacement,
                              P.O. Box 696                      stress, natural frequency
                              Troy, MI 48099                    and buckling constraints;
                                                                shape optimization

GENESIS                       VMA Engineering Inc.              Structural optimization;
                              Mandarin Avenue, Suite F          approximation concepts
                              Goleta, CA 93117                  used to tightly couple the
                                                                analysis and redesign
                                                                tasks


B.5 SCALING OF DESIGN VARIABLES AND CONSTRAINTS 

In some problems there may be an enormous difference in scale between variables 
due to differences in dimensions. For example, if the speed of the engine (n) and the 
cylinder wall thickness (t) are taken as design variables in the design of an IC engine, 
n will be of the order of 10^3 (revolutions per minute) and t will be of the order of 
1 (cm). These differences in scale of the variables may cause some difficulties while 
selecting increments for step lengths or calculating numerical derivatives. Sometimes 
the objective function contours will be distorted due to these scale disparities. Hence it 
is a good practice to scale the variables so that all the variables will be dimensionless 
and vary between 0 and 1 approximately. For scaling the variables, it is necessary to 
establish an approximate range for each variable. For this we can take some estimates 
(based on judgment and experience) for the lower and upper limits on x_i (x_i^min and 
x_i^max), i = 1, 2, ..., n. The values of these bounds are not critical and there will not 
be any harm even if they span partially the infeasible domain. Another aspect of 
scaling is encountered with constraint functions. This becomes necessary whenever the 
values of the constraint functions differ by large magnitudes. This aspect of scaling 
(normalization) of constraints was discussed in Section 7.13. 
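
One common way to carry out the variable scaling described above is the linear map y_i = (x_i - x_i^min)/(x_i^max - x_i^min). The Python sketch below assumes this particular map; the function names and the numerical bounds for the engine example (speed between 1000 and 6000 rpm, wall thickness between 0.5 and 2.5 cm) are illustrative values, not taken from the text.

```python
def scale(x, lower, upper):
    """Map each design variable to a dimensionless value of roughly 0 to 1."""
    return [(xi - lo) / (hi - lo) for xi, lo, hi in zip(x, lower, upper)]


def unscale(y, lower, upper):
    """Recover the original (dimensional) design variables from scaled ones."""
    return [lo + yi * (hi - lo) for yi, lo, hi in zip(y, lower, upper)]


if __name__ == "__main__":
    lower = [1000.0, 0.5]  # assumed lower estimates for (n in rpm, t in cm)
    upper = [6000.0, 2.5]  # assumed upper estimates for (n in rpm, t in cm)
    x = [3000.0, 1.2]      # a trial design
    y = scale(x, lower, upper)
    print(y)                         # both entries are now of order 1
    print(unscale(y, lower, upper))  # recovers approximately [3000.0, 1.2]
```

The optimizer then works with y, and the objective and constraint functions call unscale before evaluating the physical model. Because the bounds are only estimates, a scaled value slightly outside the range 0 to 1 is harmless.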


B.6 COMPUTER PROGRAMS FOR MODERN METHODS 
OF OPTIMIZATION 

Fuzzy Logic Toolbox. Matlab has a fuzzy logic toolbox for designing systems based 
on fuzzy logic. Graphical user interfaces (GUI) are available to guide the user through 
the steps of fuzzy inference system design. The toolbox can be used to model complex 
system behaviors using simple logic rules and then implement the rules in a fuzzy 
inference system. Fuzzy optimization can be implemented using the fuzzy logic toolbox in 
conjunction with an optimization program such as fmincon. 

Genetic Algorithm and Direct Search Toolbox. The genetic algorithm and direct 
search toolbox, which can be used to solve problems that are difficult to solve with 
traditional optimization techniques, is available with Matlab. The genetic algorithm of 
the toolbox can be used when the function, such as the objective or constraint function, 
is discontinuous, highly nonlinear, stochastic, or has unreliable or undefined derivatives. 
In this toolbox also, graphical user interfaces (GUI) are available for quick setting up of 
problems, selecting algorithmic options, and monitoring progress. Naturally, the options 
of creating initial population, fitness scaling, parent selection, crossover and mutation 
are available in the toolbox. The Matlab optimization programs (using direct search 
methods) can be integrated with the genetic algorithm. 

Neural Network Toolbox. The neural network toolbox is available with Matlab for 
designing, implementing, visualizing and simulating neural networks. The GUI avail- 
able with the toolbox helps in creating, training and simulating neural networks. It 
permits modular network representation to have any number of input-setting layers and 
network interconnection and a graphical view of the network architecture. Optimization 
programs can be used in conjunction with the functions of the neural network toolbox 
to accomplish neural network-based optimization. The neural network toolbox can also 
be used to apply neural networks for the identification and control of nonlinear systems. 

Simulated Annealing Algorithm. An m-file to implement the simulated annealing 
algorithm to solve function minimization problems in the Matlab environment was 
created by Joachim Vandekerckhove. The link is given below: 

http://www.mathworks.com/matlabcentral/fileexchange/10548 

Particle Swarm Optimization. An m-file to implement the particle swarm optimiza- 
tion method in the Matlab environment was created by Wael Korani. The link is given 
below: 

http://www.mathworks.com/matlabcentral/fileexchange/20205 
Ant Colony Optimization. An m-file to implement the ant colony optimization 
method in the Matlab environment for the solution of symmetrical and unsymmetrical 
traveling salesman problem was created by H. Wang. The link is given below: 

http://www.mathworks.com/matlabcentral/fileexchange/14543 

Multiobjective Optimization. An m-file to implement multiobjective optimization 
using evolutionary algorithms (based on nondominated sorting genetic algorithm, 
abbreviated NSGA) in the Matlab environment was created by Arvind Seshadri. The link is 
given below: 

http://www.mathworks.com/matlabcentral/fileexchange/10429 
Introduction to MATLAB® 


MATLAB, derived from MATrix LABoratory, is a software package that was originally 
developed in the late 1970s for the solution of scientific and engineering problems. The 
software can be used to execute a single statement or a list of statements, called a script 
or m-file. The MATLAB family includes the Optimization Toolbox, which is a library of 
programs or m-files to solve different types of optimization problems. Some basic 
features of MATLAB are summarized in this appendix. 


C.1 FEATURES AND SPECIAL CHARACTERS 

Some of the important features and special characters used in MATLAB are indicated 
below: 

1. Symbol >>            This is the default prompt symbol in MATLAB. 

2. Symbol ;             A semicolon at the end of a line suppresses the echoing of the 
                        information entered before the semicolon. 

3. Symbol ...           Three periods at the end of a line indicate the continuation of 
                        the code in the next line. 

4. help command_name    This displays information on different ways the command can 
                        be used. 

5. Symbol %             Any text after this symbol is considered a comment and will 
                        not be executed. 

6. MATLAB is case sensitive. Uppercase and lowercase letters are treated separately. 

7. MATLAB assumes all variables to be arrays. As such, separate dimension statements are 
   not needed. Scalar quantities need not be given as arrays. 

8. Names of variables: variable names should start with a letter and can have a length 
   of up to 31 characters in any combination of letters, digits, and underscores. 

9. The symbols for the basic arithmetic operations of addition, subtraction, multiplication, 
   division, and exponentiation are +, -, *, /, and ^, respectively. 

10. MATLAB has some built-in variable names and, as such, we should avoid using those 
    names for variables in writing a MATLAB program or m-file. Examples of built-in names: 
    pi (for π), sin (for sine of an angle), etc. 
C.2 DEFINING MATRICES IN MATLAB 

Before performing arithmetic operations or using them in developing MATLAB pro- 
grams or m-files, the relevant matrices need to be defined using statements such as the 
following. 

1. A row vector or 1 x n matrix, denoted A, can be defined by enclosing its 
elements in brackets and separating them by either spaces or commas. 


Example: A = [1 2 3] 

2. A column vector or n x 1 matrix, denoted A, can be defined by entering its 
elements in different lines or in a single line using a semicolon to separate them 
or in a single line using a row vector with a prime on the right-side bracket (to 
denote the transpose). 

Example: A = [1 
              2 
              3], A = [1; 2; 3], or A = [1 2 3]'. 


3. A matrix of size m x n, denoted A, can be defined as follows (similar to the 
procedure used for a column vector). 

Example: A = [1 2 3 
              4 5 6 
              7 8 9], or A = [1 2 3; 4 5 6; 7 8 9]. 


4. Definitions of some special matrices: 


A = eye(3) 

implies an identity matrix of order 3: 

    A = 1 0 0 
        0 1 0 
        0 0 1 

A = ones(3) 

implies a square matrix of order 3 with all elements equal to one: 

    A = 1 1 1 
        1 1 1 
        1 1 1 

A = zeros(2, 3) 

implies a 2 x 3 matrix with all elements equal to zero: 

    A = 0 0 0 
        0 0 0 

5. Some uses of the colon operator (:): 

(i) To generate all numbers between 100 and 50 in increments of -7 

>> 100 : -7 : 50 

This command generates the numbers 100 93 86 79 72 65 58 51 

(ii) To generate all numbers between 0 and π in increments of π/6 

>> 0 : pi/6 : pi 

This command generates the numbers 
0 0.5236 1.0472 1.5708 2.0944 2.6180 3.1416 


C.3 CREATING m-FILES 

MATLAB can be used in an interactive mode by typing each command from the 
keyboard. In this mode, MATLAB performs the operations much like an extended 
calculator. However, there are situations in which this mode of operation is inefficient. 
For example, if the same set of commands is to be repeated a number of times with 
different values of the input parameters, developing a MATLAB program will be quicker 
and more efficient. 

A MATLAB program consists of a sequence of MATLAB instructions written 
outside MATLAB and then executed in MATLAB as a single block of commands. 
Such a program is called a script file, or m-file. It is necessary to give a name to the 
script file. The name should end with .m (a dot followed by the letter m). A typical 
m-file (called fibo.m) is 

file "fibo.m" 

% m-file to compute Fibonacci numbers 
f = [1 1]; 
i = 1; 
while f(i) + f(i+1) < 1000 
    f(i+2) = f(i) + f(i+1); 
    i = i + 1; 
end 


C.4 OPTIMIZATION TOOLBOX 

The Optimization Toolbox includes programs or m-files that can be used to solve 
different types of optimization problems. The following publication gives information 
on the optimization toolbox, including algorithms and examples for different programs: 

T. F. Coleman, M. A. Branch, and A. Grace, Optimization Toolbox for Use with 
MATLAB, User's Guide, Version 2, MathWorks, Inc., Natick, MA, 1999. 

The use of any program or m-file in the optimization toolbox requires the following: 

• Selecting the appropriate program or m-file to solve the specific problem at hand. 

• Formulation of the optimization problem in the format expected by MATLAB. In 
general, this involves stating the objective function in a specific form such as a 
“minimization” type and the constraints in a specific form such as “less than or 
equal to zero” type. 
• Distinction between linear and nonlinear constraints. 

• Identification of lower and upper bounds on design variables. 

• Setting/changing the parameters of the optimization algorithm (based on the avail- 
able options). 

Using MATLAB Programs. Each program or m-file in MATLAB can be imple- 
mented in several ways. The details can be found either in the reference given above 
or online using the help command. For illustration, the help command and the response 
for the program fmincon are shown below. 

The function fmincon can be used in 12 different ways as indicated below (by 
the help command). The differences depend on the available data in the mathematical 
model of the problem and the information required from the solution of the problem. 
In using the different function calls, any data missing in the mathematical model of 
the optimization problem need to be indicated using a null vector as [ ]. Note that the 
response is edited for brevity. 

>> help fmincon 

FMINCON Finds the constrained minimum of a function of 
several variables. 

FMINCON solves problems of the form: 
min F(X) subject to: 
    A*X <= B, Aeq*X = Beq (linear constraints) 
    C(X) <= 0, Ceq(X) = 0 (nonlinear constraints) 
    LB <= X <= UB 

X=FMINCON(FUN,X0,A,B) 
X=FMINCON(FUN,X0,A,B,Aeq,Beq) 
X=FMINCON(FUN,X0,A,B,Aeq,Beq,LB,UB) 
X=FMINCON(FUN,X0,A,B,Aeq,Beq,LB,UB,NONLCON) 
X=FMINCON(FUN,X0,A,B,Aeq,Beq,LB,UB,NONLCON,OPTIONS) 
X=FMINCON(FUN,X0,A,B,Aeq,Beq,LB,UB,NONLCON,OPTIONS,... 
    P1,P2,...) 
[X,FVAL]=FMINCON(FUN,X0,...) 
[X,FVAL,EXITFLAG]=FMINCON(FUN,X0,...) 
[X,FVAL,EXITFLAG,OUTPUT]=FMINCON(FUN,X0,...) 
[X,FVAL,EXITFLAG,OUTPUT,LAMBDA]=FMINCON(FUN,X0,...) 
[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD]=FMINCON(FUN,X0,...) 
[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]=FMINCON(FUN,X0,...) 

The solution of representative constrained nonlinear programming problems using the 
function fmincon is illustrated in Chapters 1 and 7. 
CHAPTER 1 


1.1 Min. f = 5x_A - 80x_B + 160x_C + 15x_D, 0.05x_A + 0.05x_B + 0.1x_C + 
0.15x_D < 1000, 0.1x_A + 0.15x_B + 0.2x_C + 0.05x_D < 2000, 0.05x_A + 0.1x_B + 
0.1x_C + 0.15x_D < 1500, x_A > 5000, x_B > 0, x_C > 0, x_D > 4000 


1.2(a) X* = {0.65, 0.53521} (b) X* = {0.9, 2.5} 

(c) X* = {0.65, 0.53521} 1.5 x* = x* = 300 

1.9(a) R * = 4.472, R* = 2.236 (b) R* = 3.536, R* = 3.536 

(c) R* = 6.67, R* = 3.33 

1.11(a) y_1 = ln x_1, y_2 = ln x_2, ln f = 2y_1 + 3y_2 


(b) / = 10- V2 * 2 , x\ = lO^ 2 , In (log j 0 /) = In (log 10 xi) + In x 2 


1.14 x_ij = 1 if city j is visited immediately after city i, and = 0 otherwise. 

Find {x_ij} to minimize $f = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} x_{ij}$, subject to 
$\sum_{j=1,\, j \ne i}^{n} x_{ij} = 1$ (i = 1, 2, ..., n) and 
$\sum_{i=1,\, i \ne j}^{n} x_{ij} = 1$ (j = 1, 2, ..., n) 


1.19 Min. / = plbd, 


p y 


6P x l ^ Py 
bd 2 


6 P x l n~Ed~ 

< 


b > 0.5, b < 2d. 


bd ' bd 2 ~ bd bd 2 ~ 48 l 2 
1.25 Max. / = | t m + | t d , t m +t d < 40, t d > 1.25 t m , 0 < t m <24, 0 < t d < 20. 

1.29 Min. / = 7rx3[x 2 — (x x — x 2 ) 2 ] + |7r[x 3 — (x\ — X4) 3 ], ttx 3 (xi — x 2 ) 2 + 


^7r(x] — X 4) 3 


4,619,606 < 0, x 2 


pRo 

S.e + OAp 


<0, X4 


pRo 

S.e + 0.8 p 


<0 


CHAPTER 2 


2.1 r* = R 2.3 x* = 1.5 (inflection point) 
2.5 x = -1 (not min, not max), x = 2 (min) 


2.9 d = 



1/4 


2.10 35.36 m 2.11(a) 79.28° (b) 0.91 1 from end of stroke 
2.13 positive semidefinite 2.15 positive definite 

2.17 negative definite 2.19 indefinite 
2.21 x* = 0.2507 m, x* = 5.0879 × 10^-3 m 
2.23 a = 328, b = -376 2.26 x* = 27, y* = 21 
2.27 x* = 100 2.28(a) minimum (b) minimum 
(c) saddle point (d) none 2.30 saddle point at (0, 0) 

2.33 dx_1 = arbitrary, dx_2 = 0 2.36 radius = 2r/3, length = h/3 


2.41 x* = x* = (5/3) 1 / 2 , x* = (S/12) 1 / 2 

2.43 d* = \{{a+b)~ si a 2 - ab + b 2 } 2.47 200 mm x 250 mm 

2.50 X* = {4,2,2} 2.53 198.43 ft x 113.39ft 

2.55(a) f*_new = 15π (b) f*_new = 18π 2.57(a) f* = 1/3 

(b) f* = -1/9 2.61 X_2 is local minimum 

2.63(a) Kuhn-Tucker conditions satisfied 

(b) λ_1 = 0.4, λ_2 = 0.2, λ_3 = 0 2.65(a) S = {1, -3} (b) none 

2.67 optimum 2.69 x* = j, x* = 4yf 2.73 convex 2.75 none optimum 

CHAPTER 3 

3.3 x_1 = 1, x_2 = 2, x_3 = 3 3.5 x_1 = 2, x_2 = 4, x_3 = 6 

3.7 x* = 1/3, x* = 4/3 3.9 x* = 2§, x* = 1± 

3.12 x* = 3-pj-, y* = 3^ 3.15 = 5^, y* = 1^ 

3.17 all points on line joining (2, 10) and (7.4286, 15.4286) 

3.18 x* = 10, y* = 18 3.20 x* = 9/7, y* = 40/7 

3.23 x* = 6, y* = 1 3.25 x* = 6, y* = 0 

3.27 x* = 75/8, y* = 27/8 3.29 x* =3, y* = -2.5 

3.31 x* = 4, y* = 0 3.33 unbounded 3.35 x* = 4/7, y* = 30/7 

3.37 x* = 36/7, y* = 15/7 3.39 x* = 16/5, y* = 1/5 

3.41 infeasible 3.43 unbounded 

3.48 x* = 3000.0, x 2 * = 416.7, x 3 * = 1200.0 

3.50 x* (barley) = 40. x 2 = x 3 = x^ = 0, x| (leased) = 160 


2.38 length = (a^(2/3) + b^(2/3))^(3/2) 



Answers to Selected Problems 797 


3.55 x* = 1.5, Xg = 0 3.57 x* = 16, x* d = 20 
3.60 x* = 36/11, y* = 35/11 

3.66 all points on the line joining (7.4286, 15.4286) and (10, 18) 

3.71 x* = 3.6207, y* = 8.4483 3.75 x* = 2/7, y* = 30/7 
3.79 x* = 56/23, y* = 45/23 3.85 x* = -4/3, y* = 7 
3.89 x* = 0, y* = 3 

3.92 (xi, X2) = amounts of mixed nuts (A , S ) used, lb. xf = 80/7, x* — 120/7 
3.94 x* = 62.5, x* B = 31.25 

3.96 Xi — number of units of P, produced per week, xf = 100/3, x| = 250/3 

3.99 (xi, X2) = number of units of (1 , J ) sold per month, xf = 19.17, xf = 45 

3.102 Xi = number of days used in a month for process type i (i = 1. 2, 3, 4). 
x* = 30, x* = x* = xf = 0 

CHAPTER 4 

4.1 X* = {2.333, 1.333,0,0} 

4.3 x* — 0, r = 1,2,3, x\ = 2/5, x| = 4/5 4.5 solution unbounded 

4.9 x* =0, i = 1, 2, 5, 6, 7, x 3 * = 0.5, x\ = 1.5 

4.12 x* = 2.35, x* =0.1, x* = 2.7, x 4 * = 1.2 

4.15 x* = x| = x* = x* = 0, x* = 120, x| = 100 

4.17 optimum solution remains same, f*_new = -27,600/3 

4.19 (xi, X2, X3, X4) = number of units of products (A, B, C, D) produced, 
x* = 4000/3, xf = x* = 0, x 4 * = 200/3 

4.23 x* = 1000/3, x* = x 3 * = 0, x* = 800/3 

4.29 x* =0, x| = 0.5 4.31 xf =0, x* = 0.5 

4.33 infinite solutions 4.35 x*=0, x*=0.5 

4.37 X (2) = {0.3367,0.3112,0.3250} 

4.40 xf = 0.9815, x* = 1.2323, x* = 0.4471 

CHAPTER 5 

5.2 0.484 5.3 0.481 5.4 0.49 5.6 0.8 5.9 0.7817 

5.11(a) 0.786151 (b) 0.786142 (c) 0.786192 5.14(a) 999 
(b) 20 (c) 19 (d) 14 (e) 14 5.17(a) 2.7814 

(b) 2.7183 5.18(a) 2.7183 (b) 2.7289 (c) 2.7183 

5.20 0.25 5.21 0.001257 5.22 0.00126 5.24 0.00125631 

CHAPTER 6 

6.1 Min. / = Po(0.5«| + OSu^ — mim 2 — m 2 ) 

6.2 f\ = 7.0751, h = 74.8087 where f = 

6.4 x j = 65.567, x 2 = 52.974 6.5 x* = 4.5454, xf = 5.4545 

6.7 f = 4250x_1^2 - 1000x_1x_2 - 2500x_1x_3 + 1500x_2^2 - 500x_2x_3 + 5750x_3^2 
- 1000x_1 - 2000x_2 - 3000x_3, X* = {0.3241, 0.8360, 0.3677} 

6.9 X* ≈ {1, 1} 6.12 X* = {0.9465, 2.0615, 2.9671} 

6.14 f(z_1, z_2) = -5 + 1.0429z_1 - 0.7244z_2 + 0.5z_1^2 + 0.5z_2^2 

6.16(a) yes (b) no 6.19(a) 60,002.0 (b) 241.3729 

6.30 Xi = {2, -1, -8} X 2 = {2, -0.7, -8} X 3 = {2.26, -0.85, -8} 

X 4 = {2.15, -0.74, -7.755} 6.35 X 2 = {5.57, 0}, f 2 > fi 

6.38 x* = 1, x* = 1 6.45 X 5 = {2.0869, 1.7390}, f 5 = -8.3477 

6.47 X* = {—2, 1, 4} 6.48 x* = 1.1423, y* = 0.8337 

6.50 x* = 1.698105, x* = 0.883407 6.52 X* = {5, -8} 

6.55(a) no (b) yes 

CHAPTER 7 

7.1 X* = {2, 3}, f* = -50 

7.6(a) Min. / = 12x^ + 30x, — 8 xix 2 — 22xi + 60x 2 — 78, x 2 + 2 = 0, 
x\ +x 2 < 0 

(b) Min. / = 18xi — 68x 2 — 70, x 2 + 2 = 0, x\ + x 2 < 0 

7.8 X* = {1.74558, 1.95265}, f* = -9.23478 

7.11 Max. / = 3.5483 d 4 w, 2.2227 x 10 ~ 6 d 4 - 1 < 0, 0.2223 d 2 w - 150 < 0, 
d < 25 7.13 — 8ii + 4 ,s 2 <0, si + 2 s 2 < 0, — si < 0 

7.15 X* = {0.75, 4.56249}, f* = 0.25391 

7.18 X* = {3, 3}, f* = 18 7.21 x* = 24 cm, x* = x* = 12 cm 

7.23(a) $\phi_k = 2x - r_k \left( \frac{1}{2 - x} + \frac{1}{x - 10} \right)$, 

(b) $\phi_k = 2x + r_k \left( \langle 2 - x \rangle^2 + \langle x - 10 \rangle^2 \right)$ 

7.27 x* = 0.989637, x* = 1.979274 
7.29 ±x 2 + ±x 2 - 1 < 0, jci/ 5 + x 2 /3 - 1 < 0, n = 1.5 
7.31 x* = 4.1, x* = 5.9 7.34 X* ≈ {0.8984, 0}, f* ≈ 2.2079 
7.36 X* ≈ {1.671, 17.6} 7.39 x_1 = 0.4028, x_2 = 0.8056 

1 

7.42 optimum, A.j = a 2 = — — , A. 3 = 1 1 

4V2 

7.45 X* ≈ {1.3480, 0.7722, 0.4299}, f* ≈ 0.1154 

CHAPTER 8 

8.1 f > 2.268866 8.2 f > 3.464102 8.3 f > 3 

8.5 radius = 0.4174m, height = 1.6695m 

8.6 radius = 0.3633 m, height = 2.9067 m 

8.7 x l = 1.5 x 10 6 , x* = 1.0 x 10 6 

8.9 x* = 5.7224, x| = 0.8737, x 3 * = 7.2813 

8.10 x* = 1.0845, x* = 1.1761 

8.11 x* = 8.6365, x* = 0.9397, x* = 6.8219, xj = 0.9609 

8.12 x* = 1.1262, x* = 1.1945, x 3 * = 1.6575 

8.13 x* = 2.2629, x* = 7.1689, x 3 * = 4.5850 

8.14 x* = 0.3780, x* = 0.5345, x 3 * = 0.5714 

8.17 d* = 0.002808 m, D* = 0.02935 m 

8.18 V* = 323.3201 ft/min, F* = 0.005 in/rev 8.20 2 

8.22 R* = 0.2118, L* = 0.2907 

8.23 R* = 1.2821, L* = 0.5266, f* = 16.2056 

CHAPTER 9 

9.1 x* = 2, x* = x 3 * = 0, x* A = 3 9.2 A-H-f-M-I-f 

9.3 ni = 2, « 2 = 3, « 3 = 1 9.4 24,000ft at J , C , D , and E 

9.5 D ■ J -I -I -/-/•* 

9.6 stage 1 (0, n), stage 2 (0, 2n/3), stage 3 {An/9, 0) 

9.7 A Bi Ci £>i C 9.9 units invested in stations 1, 2, 3: (0, 2, 1) 
9.10 x_1* = 7.5, x_2* = 10.0 9.11 x_1* = 60, x_2* = 70, x_3* = 80 

9.13 x* — 5, x\ = 0, xf = 5, x\ = 0 


CHAPTER 10 

10.1 X* = {2, 1}, f* = 13 10.3 X* = {0, 9}, f* = 27 

10.4 X* = {1, 0}, f* = 3 10.5 X* = {0, 3}, f* = 3 

10.6 X* = {3, 3}, f* = 39 10.7 X* = {4, 3}, f* = 10 

10.8 187 = 10111011 10.9 X* = {1, 2, 0}, f* = 3 

10.12 X* = {1, 1, 1}, f* = 18 10.13 X* = {1, 1, 1, 1, 0}, f* = 9 

10.15 X* = {4, 0}, f* = 4 10.16 X* = {2, 2.5}, f* = 20.5 

CHAPTER 11 


11.4 a = 769.2308, μ_X = 1, σ_X = 0.048038 

11.7 f_X(x) = x + 1.5x^2, f_Y(y) = y + 1.5y^2 

11.8 σ_x = 0.006079 cm, rejects = 1.32% 11.9 independent 
11.10 dependent 11.11(a) 0.99904, (b) 0.0475, 

(c) 3616 kg_f/cm^2 11.12 0.6767 

11.13 R = 268.9520 ft, σ_R = 56.1941 ft, R_second order = 270.1673 ft 

11.15 X* = {0, 0, 0, 12}, f* = 12 

11.17(a) X* = {0.0, 36.93, 174.40}, f* = 1,891.72 

(b) X* = same as in (a), σ_f = 524.50 

(c) X* = same as in (a), (f + σ_f)* = 2,416.22 

CHAPTER 12 

12.3 x(t) = c_1 e^t + (2 - c_1) e^(-t) - t where c_1 is a constant 

12.4 circle of radius L/(2π) 
CHAPTER 13 


13.1 Before: Xi = j ^ 
13.3 (a) 9, (b) 10, (c) 11 



After: X, = 



9 

13 


13.4 10 13.6 (i j ) = (i 4) 13.8 x* = 2 


13.9 xj(2) =2.8297, x 2 (2) = 1.9345, x 3 (2) = 1.6362, x 4 (2) = 1.1887 


13.12 Number of copies of strings 1, 2, 3, 4, 5, 6, 7 are 0, 0, 1, 2, 5, 2, 2, respectively 
13.14 String length = 37 


CHAPTER 14 


14.1 c* = 0.04, c* = 0.81 


14.3(a) {0.001165, 0.002329, 0.03949, -0.05635}, 

(b) {0.0009705, 0.001941, 0.05273, -0.084102}, 

(c) {0.0009704, 0.001941, 0.05265, -0.08395} 


14.5 


(937 
1 9*1 


{-0.000582, -0.001165, -0.002329, 0.002329} 


14.7 


(937 
1 9x 3 


= {0.4693 x 10~ 7 , 0.9477 x 10“ 7 , -0.027948,0.027947} 


14.9(a) 


(0.000125 

(0.000458 


(b) 


(-0.000229 

(0.0 

(-0.000229 

’ (0.000333 


(c) 


(-275 

|° 

1 o 

’ (200 


14.11 

3Y 2 
3A 2 1 


37.] 

dAi 


2.28840, 


dX 2 

9A 2 


(0.698492 x 10“ 8 
(0.883790 x 10” 2 


46.8649, 


3Yi 

9A 2 


(-0.312639 x 10- 12 
l 0.391666 x 10“ 6 


14.15 — - = —1 .584664, — ^ = -2.744719 

3 D 3 D 

14.16 y* = 3, A* = 0.316228 x 10“ 7 , A* = 0.948683 x 10“ 7 , f* = 0.6 x 10“ 6 
14.18 y* = 0.25, A{ = 1.0, A* = 1.0, /* = 43.7565 


14.20 X* = {0.7635, 1.0540}, /* = 187.5670 with / = 0.625 /j + 1061.0 f 2 

14.21 X* = {0.8, 1.1}, F* = 3.1267 

14.22 X* = {0.75, 1.25} 
A 

Absolute minimum, 63 
Active constraint, 8, 94 
Addition of constraints, 218 
Addition of new variables, 214 
Additive algorithm, 605 
Adjoint equations, 682 
Adjoint variable, 682 
Admissible variations, 78 
All-integer problem, 588 
Analytical methods, 253 
Answers to selected problems, 795 
Ant colony optimization, 3, 693, 714 
algorithm, 717 
ant searching behavior, 715 
basic concept, 714 
evaporation, 716 
path retracing, 715 
pheromone trail, 715 
pheromone updating, 715 
Applications of geometric programming, 
525 

Approximate mean, 642 
Approximate variance, 642 
Arithmetic-geometric inequality, 500 
Artificial variables, 139 
Augmented Lagrange multiplier method, 
459 

equality-constrained problems, 459 
inequality-constrained problems, 462 
mixed equality-inequality-constrained 
problems, 463 

Augmented Lagrangian function, 460 
Availability of computer programs, 786 
Average, 635 

B 

Balas algorithm, 604 
Balas method, 589, 604 
Barrier methods, 433 


Basic feasible solution, 131, 136 
Basic set operations, 724 
Basic solution, 130, 136 
Basic variables, 136 
Basis, 130 

Basis vector approach, 743 

Beale’s function, 365 

Beam-column, 55 

Bearing, 531 

Behavior constraints, 7 

BFGS formula, 353 

BFGS method, 360 

Bias of random directions, 312 

Binary numbers, 607 

Binary programming, 624 

Binary variables, 607 

Bivariate distribution, 639 

Boltzmann’s constant, 703 

Boltzmann’s probability distribution, 

703 

Boundary value problem, 549 
Bounded objective function method, 764 
Bound point, 8 

Brachistochrone problem, 671 
Bracket function, 443, 696 
Branch and bound method, 609 
Branching, 610 

Brown’s badly scaled function, 365 
Broyden-Fletcher-Goldfarb-Shanno 
method, 304, 360 

C 

Calculus methods, 3 

Calculus of variations, 3, 668 

Canonical form, 133 

Cantilever beam, 527 

Cauchy method, 304, 339 

Cauchy’s inequality, 500 

Central limit theorem, 647 

Chance constrained programming, 647 
Change in constraint coefficients, 215 
Change in cost coefficients, 212 
Change in right hand side constants, 208 
Characteristics of constrained 
problem, 380 
Choice of method, 784 
Circular annular plate, 690 
Classical optimization techniques, 63 
Classification: 

of optimization problems, 14 
of unconstrained minimization 
methods, 304 

Classification of optimization problems 
based on: 

deterministic nature of variables, 29 
existence of constraints, 14 
nature of design variables, 15 
nature of equations involved, 19 
number of objective functions, 32 
permissible values of design 
variables, 28 

physical structure of the problem, 16 
separability of the functions, 30 
Closed half space, 128 
Cluster analysis, 3 
Coefficient of variation, 636 
Collapse mechanism, 121 
Comparison: 

of constrained methods, 785 
of elimination methods, 272 
of methods, 294 
of unconstrained methods, 784 
Complementary geometric 
programming, 520 
degree of difficulty, 523 
solution procedure, 522 
Complement of a fuzzy set, 724 
Complex method, 384 
Composite constraint surface, 8 
Computational aspects of optimization, 
784 

Computer programs, availability of, 786, 788

Computer program for: 

ant colony optimization, 789 
fuzzy logic toolbox, 788 


genetic algorithm and direct search 
toolbox, 788 

modern optimization methods, 788
multiobjective optimization, 789 
neural network toolbox, 788 
particle swarm optimization, 788 
simulated annealing algorithm, 788 
Concave function, 779 
Concept of cutting plane, 591 
Concept of suboptimization, 549 
Concrete beam, 29 
Condition number of a matrix, 306 
Cone clutch, 49, 528 
Conjugate directions, 319 
Conjugate gradient method, 341, 355, 361 
Consistency condition, 220 
Constrained minimization (GMP), 508 
Constrained optimization problem, 6, 380 
characteristics, 380 

Constrained optimization techniques, 380 

Constrained variation, 77, 79 

Constraint qualification, 98 

Constraint surface, 8 

Contact stress between cylinders, 297 

Contact stress between spheres, 250 

Continuous beams, 576 

Continuous dynamic programming, 573 

Continuous feasible solution, 609 

Continuous random variable, 634 

Contours of objective function, 10 

Contraction, 332 

Contraction coefficient, 332 

Control variables, 16 

Control vector, 678 

Convergence of constrained problems, 464 
Convergence of order p, 305 
Conversion of final to initial value 
problem, 566 

Conversion of nonserial to serial 
system, 548 
Convex: 

function, 779 
polygon, 127 
polyhedron, 127, 129 
polytope, 129 

programming problem, 98, 104, 442 
set, 129 
Cooling fin, 675
Correlation, 640 
Correlation coefficient, 640 
Correlation matrix, 646 
Covariance, 640 
Covariance matrix, 648 
CPM and PERT, 3 
Crane hook, 60 
Crisp set theory, 723 
Criterion function, 9 
Critical points, 750 
Crossover, 699 

Cubic interpolation method, 253, 280 
Cumulative distribution function, 634 
Curse of dimensionality, 573 
Curve of minimum time of descent, 672 
Cutting plane method, 390, 589 
algorithm, 387 

geometric interpretation, 388 
Cyclic process, 331 
Cylinders in contact, 297 

D 

Darcy-Weisbach equation, 663 
Darwin’s theory, 694 
Davidon-Fletcher-Powell method, 304, 354

DC motor, 53 
Decision variables, 6 
Decomposition principle, 177, 200 
Degenerate solution, 142 
Degree of difficulty, 496 
Derivatives 

of eigenvalues and eigenvectors, 747 
of static displacements and stresses, 745 
of transient response, 749 
Descent direction, 336 
Descent methods, 304, 335 
Design constraints, 7 
Design equations, 547 
Design of: 

cantilever beam, 527 
column, 11 
cone clutch, 528 
continuous beams, 576 
drainage system, 579 
four bar mechanism, 535 


gear train, 579 
helical spring, 529, 659 
hydraulic cylinder, 527 
lightly loaded bearing, 531 
planar truss, 249 
two bar truss, 533 
Design of experiments, 3 
Design point, 7 
Design space, 7 
Design variable linking, 738 
Design variables, 6 
Design vector, 6 
DFP formula, 352 
DFP method, 354 
Dichotomous search, 253, 257 
Differential calculus methods, 253, 493 
Differential of f, 68
Direction finding problem, 395 
Direct methods, 380, 383 
Direct root method, 253, 286 
Direct search methods, 309 
Direct substitution, 76 
Discrete programming problem, 588 
Discrete random variable, 634 
Discriminant analysis, 3
Drainage system, 579 
Dual function, 501 
Duality in linear programming, 192 
Duality theorems, 195 
Dual problem, 192, 509 
Dual simplex method, 195 
Dynamic optimization problem, 16 
Dynamic programming, 3, 544 
applications, 576 
calculus method of solution, 555 
computational procedure, 553 
continuous, 573 

conversion of final to initial value 
problem, 566 

problem of dimensionality, 572 
recurrence relation, 551 
tabular method of solution, 560 

E 

Electrical bridge network, 52 
Elementary operations, 133 
Elimination methods, 253, 254 
Elimination methods - comparison, 271

Engineering applications of 
optimization, 5 

Engineering optimization literature, 35 
Equality constraints, 6 
Euler equation, 671 
Euler-Lagrange equation, 671 
Evaluation of gradient, 337 
Event, 633 

Exhaustive search, 253, 256 
Expansion, 331 
Expansion coefficient, 332 
Expected value, 635 
Experiment, 633 

Extended interior penalty function, 451
Exterior penalty function method, 443, 455
algorithm, 445 
convergence proof, 447 
mixed equality-inequality 
constraints, 455 
parametric constraints, 459 
Extrapolation of design vector, 448 
Extrapolation of objective function, 450

Extrapolation technique, 447 
Extreme point, 130 

F 

Factor analysis, 3 

Failure mechanisms of portal frame, 246 

Fast reanalysis techniques, 740 

Fathomed, 610 

Feasible direction, 95, 393 

Feasible direction methods, 393, 394 

Feasible solution, 130 

Feasible space, 8 

Fibonacci method, 253, 263 

Fibonacci numbers, 263 

Final interval of uncertainty, 263, 272 

Final value problem, 548 

First level problem, 757 

First order methods, 305 

Fitness, 696 

Fletcher and Powell’s helical valley 
function, 364 


Fletcher-Reeves method, 304, 341 
algorithm, 343 
Floor design, 542 
Flow chart: 

for augmented Lagrange multiplier 
method, 461 

for cubic interpolation method, 284 
for Fibonacci search method, 266 
for linear extended penalty function 
method, 453 

for parallel simulated annealing, 762 
for Powell’s method, 325 
for simplex algorithm, 143 
for simplex method, 153 
for simulated annealing, 706 
for two-phase simplex method, 153 
Flywheel design, 482 
Forced boundary conditions, 671 
Four bar mechanism, 535 
Four bar truss, 60, 557 
Free boundary conditions, 671 
Free point, 8 

Freudenstein and Roth function, 364 
Functional, 669 
Functional constraints, 7 
Function of a random variable, 638 
Function of several random variables, 640 
mean, 641 
variance, 641 
Fuzzy decision, 726 
Fuzzy feasible region, 822 
Fuzzy optimization, 3, 693, 722 
computational procedure, 726 
fuzzy set theory, 722 
Fuzzy systems, 722, 725 

G 

Game theory, 3 
Gaussian distribution, 643 
Gear train, 579 
General iterative scheme of 
optimization, 305 

Generalized penalty function method, 619 
Generalized reduced gradient, 415 
Generalized reduced gradient method, 412 
algorithm, 416 

General primal dual relations, 193 
Genetic algorithms, 3, 693, 694, 701 
Genetic operators, 697 
Geometric boundary conditions, 671 
Geometric constraints, 7 
Geometric programming, 3, 22, 492 
applications, 525 

arithmetic-geometric inequality, 500 
complementary geometric 
programming, 520 
constrained problem, 508, 509 
degree of difficulty, 496 
mixed inequality constraints, 518 
normality condition, 495 
orthogonality conditions, 495 
primal dual relations, 501 
unconstrained problem, 493 
Geometry of linear programming 
problems, 124 

Global criterion method, 764 
Global minimum, 63 
Goal programming method, 765 
Golden mean, 270 
Golden section, 270 
Golden section method, 253, 267 
Gomory’s constraint, 592 
Gomory’s cutting plane method, 591 
for all integer problem, 592 
graphical representation, 589 
for mixed integer problem, 599 
Gradient, 95, 335 
Gradient evaluation, 337 
Gradient of a function, 335 
Gradient methods, 335 
Gradient projection method, 404 
algorithm, 409 
Graphical optimization, 10 
Graphical representation, 589 
Grid search method, 304, 314 

H 

Hamiltonian, 682 
Helical spring, 22, 529 
Helical torsional spring, 541 
Hessian matrix, 71, 302 
Heuristic search methods, 381 
Historical development, 3 


Hitchcock-Koopman’s problem, 221 
Hollow circular shaft, 51 
Hopfield network, 729 
Huang’s family of updates, 353 
Hydraulic cylinder design, 527 
Hyperplane, 128 

I 

Identifying optimal point, 140 
Ill conditioned matrix, 306
Improving nonoptimal solution, 141 
Inactive constraint, 94 
Incremental response approach, 740 
Independent events, 633 
Independent random variables, 639 
Indirect methods, 335, 380, 428 
Indirect updated method, 361 
Inequality constraints, 6, 93 
Infeasibility form, 152 
Infinite number of solutions, 148 
Inflection point, 65 
Initial value problem, 548 
Input state variables, 546 
Integer feasible solution, 610 
Integer lattice points, 591 
Integer linear programming, 589 
Integer nonlinear programming, 606 
Integer polynomial programming, 606 
Integer programming, 3, 28, 588 
Interior method, 222 
Interior penalty function method, 432, 454

convergence proof, 438 
extrapolation technique, 447 
iterative process, 433 
penalty parameter, 435 
starting feasible point, 434 
Interpolation methods, 253, 271 
Interpretation of Lagrange multipliers, 90 
Intersection of convex sets, 131 
Intersection of fuzzy sets, 725 
Interval halving method, 260 
Interval of uncertainty, 256, 263 
Introduction to optimization, 1 
Inverse update formulas, 353 
Inverted utility function method, 764 
Iterative process of optimization, 252 
J

Jacobian, 82 

Joint density function, 639 
Joint distribution function, 639 
Jointly distributed random variables, 639 
Joint normal density function, 646 

K 

Karmarkar’s method, 222 
algorithm, 226 
conversion of problem, 224 
statement of problem, 223 
Kohonen network, 729 
Kuhn-Tucker conditions, 98, 401 
testing, 465 

L 

Lagrange method, 85 
necessary conditions, 86 
sufficiency conditions, 87 
Lagrange multipliers, 85, 675, 682 
Lagrangian function, 86, 230 
Learning process, 728 
Lexicographic method, 765 
Limit design of frames, 120 
Linear convergence, 305 
Linear extended penalty function, 451
Linearization of constraints, 387 
Linearization of objective, 387 
Linear programming, 3, 26, 119, 177 
additional topics, 177 
applications, 120 
definitions, 127 
theorems, 127 
two phases, 150 

Linear programming problem, 26 
as a dynamic programming 
problem, 569 
geometry, 124 
infinite solutions, 126, 148 
matrix form, 122 
scalar form, 122 
standard form, 122 
unbounded solution, 127, 146 
Linear simultaneous equations, 133 
Line segment, 128, 202 
Local minimum, 63 


M 

Machining economics problem, 525 
Marginal density function, 639 
Markov processes, 3 
Marquardt method, 304, 348 
Mathematical programming 
techniques, 1, 3 
MATLAB, 791
creating m-files, 793 
defining matrices, 792 
features, 791 
introduction, 791 
optimization toolbox, 793 
special characters, 791 
using programs, 794 
MATLAB solutions, 37 

binary programming problem, 624 
constrained optimization problem, 474 
goal attainment method, 767 
interior point method, 235 
linear programming problem, 156 
one-dimensional problem, 294
quadratic programming problem, 237 
unconstrained optimization 
problem, 365 

Matrix methods of structural analysis, 248 
Maxwell distribution, 663 
Mean, 635 

Membership function, 723 
Merit function, 9 

Method of constrained variation, 77 
Method of Lagrange multipliers, 85 
Methods of feasible directions, 393 
Methods of operations research, 2 
Metropolis criterion, 703 
MIMD architecture, 761 
Minimum cost pipeline, 585 
Minimum drag, 672 
Mixed constraints, 453 

exterior penalty function method, 455 
interior penalty function method, 454 
Mixed equality and inequality 
constraints, 453 
Mixed integer problem, 588 
Model coordination method, 755 
Modern optimization techniques, 3, 4, 693 
Monotonicity, 547 
Motivation of simplex method, 138
Multibay cantilever truss, 578 
Multilayer feedforward network, 729 
Multilevel optimization, 755 
Multimodal function, 254 
Multiobjective optimization, 761 
Multiobjective programming, 3, 9, 33 
Multiobjective programming 
problem, 9, 33 

Multiple objective functions, 9 
Multistage decision problem, 544, 548 
Multistage decision process, 545 
Multivariable optimization, 68 
with equality constraints, 75 
with inequality constraints, 93 
necessary conditions, 69, 81, 94, 98 
with no constraints, 68 
sufficiency conditions, 70, 83 
Multivariate distribution, 639 
Mutation, 700 

N 

Natural boundary conditions, 671 
Necessary conditions for optimal control, 
679 

Negative definite matrix, 71 
Network methods, 3 

Neural network based methods, 693, 727 
Neural networks, 3, 728 
Neuron, 728 

Newton method, 253, 286, 304, 345 
Newton Raphson method, 287 
Node, 610 

Nonbasic variables, 136 
Nonconvex sets, 129 
Nondegenerate solution, 131 
Nongradient methods, 304 
Nonlinear programming, 3, 248, 301, 380 
Nonlinear programming problem, 19 
Nonpivotal variables, 135 
Nontraditional optimization techniques, 3 
Normal distribution, 643 
Normality condition, 495 
Normalization condition, 635 
Normalization of constraints, 436 
Normalized beta function integrand, 620 
Norm of a matrix, 306 


Norm of a vector, 305 
Number of experiments, 272 
Numerical integration, 458 

O 

Objective function, 6, 9 
Objective function surfaces, 9 
Offspring, 700 

One degree of difficulty problem, 515, 
526, 532 

One dimensional minimization 
methods, 248 
Operations research, 1, 3 
Optimal basic solution, 131 
Optimal control problem, 16 
Optimal control theory, 668, 678 
Optimality criteria, 683 

multiple displacement constraints, 684 
single displacement constraint, 683 
Optimality criteria methods, 668, 683 
Optimal layout of a truss, 577 
Optimal solution, 131 
Optimization, 1 

Optimization of fuzzy systems, 722, 725 
Optimization problems 
classification, 14 
statement, 6 

Optimization techniques, 3, 35 
Optimization toolbox, 36 
Optimum machining conditions, 525 
Orthogonal directions, 319 
Orthogonality conditions, 495 
Output state variables, 546 
Overachievement, 766 

P 

Parallel processing, 760 
Parallel simulated annealing, 761 
Parameter optimization problem, 15 
Parametric constraint, 456 
Parametric programming, 207 
Parent, 700 

Pareto optimum solution, 763 
Particle swarm optimization, 3, 693, 708 
computational implementation, 709 
inertia term, 710 
inertia weight, 711 
particle, 708 
position, 708 
velocity, 708 
Pattern directions, 318 
Pattern recognition, 3 
Pattern search methods, 304 
Penalty function, 696 
Penalty function method: 
basic approach, 430 
convergence criteria, 435 
convergence proof, 438, 447 
exterior method, 443 
extrapolation, 447 
initial value of parameter, 435 
interior method, 432 
iterative process, 433, 445 
mixed equality and inequality 
constraints, 453 

normalization of constraints, 436 
parametric constraints, 456 
penalty parameter, 431 
starting feasible point, 434 
Penalty parameter, 431 
Performance index, 16, 679 
Perturbing the design vector, 465 
Phase I of simplex method, 151 
Phase II of simplex method, 152 
Pivot operation, 134 
Pivot reduction, 135 
Point in n-dimensional space, 128 
Polynomial programming problem, 589

Population, 697 
Positive definite matrix, 71 
Positive semidefinite matrix, 71 
Post optimality analysis, 207 
Posynomial, 22, 492 
Powell’s badly scaled function, 365 
Powell’s method, 304, 319 
algorithm, 323 
convergence criterion, 326 
flow chart, 325 

Powell’s quartic function, 364 
Power screw, 57 

Practical aspects of optimization, 737 
Practical considerations, 293 


Preassigned parameters, 6 
Precision points, 535 
Predual function, 500 
Pressure vessel, 59 
Primal and dual programs, 505, 510 
Primal dual relations, 193, 501 
Primal function, 500 
Primal problem, 192, 509 
Principle of optimality, 549 
Probabilistic programming, 632 
Probability, definition, 632, 633 
Probability density function, 633 
Probability distribution function, 634 
Probability distributions, 643 
Probability mass function, 634 
Probability theory, 632 
Problem of dimensionality, 572 
Projected Lagrangian method, 425 
Projection matrix, 405 
Proportional damping, 749 
Pseudo dual simplex method, 606 

Q 

Quadratically convergent method, 319
Quadratic convergence, 287, 305, 325 
Quadratic extended penalty function, 452 
Quadratic form, 71 

Quadratic interpolation method, 253, 273 
Quadratic programming, 3, 24, 229 
Quasi-Newton condition, 304 
Quasi-Newton method, 253, 288, 350 
Queueing theory, 3 

R 

Railroad track, 48 
Random jumping method, 311 
Random search methods, 304, 309, 383 
Random variables, 633 
Random walk with direction 
exploitation, 313 
Random walk method, 312 
Rank 1 updates, 351 
Rank 2 updates, 352 
Rate of change of a function, 338 
Rate of convergence, 305 
Real valued programming problem, 28 
Reanalysis, 740 
Reciprocal approximation, 685
Recurrence relationship, 551 
Reduced basis technique, 737 
Reduction ratio, 265 
Reduction of size, 737 
Refitting, 277 
Reflection, 328 
Reflection coefficient, 330 
Reflection process, 329 
Regression analysis, 3 
Regular simplex, 328 
Relative frequency of occurrence, 636 
Relative minimum, 63 
Reliability theory, 3 
Renewal theory, 3 
Reproduction, 697 
Reservoir pump installation, 505 
Reservoir system, 160 
Return function, 546 
Revised simplex method, 177 
step-by-step procedure, 182 
theoretical development, 178 
Rigid frame, 120 
Rocket in outer space, 16 
Rosenbrock’s parabolic valley 
function, 363 

Rosen’s gradient projection method, 404 
algorithm, 409 

determination of step length, 407 
projection matrix, 405 
Roulette-wheel selection scheme, 698 

S 

Saddle point, 73 
Scaffolding system, 26, 57 
Scaling of constraints, 787 
Scaling of design variables, 305, 787 
Search with accelerated step size, 255 
Search with fixed step size, 254 
Secant method, 253, 290 
Second level problem, 757 
Second order methods, 305 
Semidefinite matrix, 71 
Sensitivity analysis, 207 
Sensitivity equations, 751 

using Kuhn-Tucker conditions, 752 


using the concept of feasible 
direction, 754 

Sensitivity of optimum solution, 751 
Sensitivity to problem parameters, 751 
Separability, 547 
Separability of functions, 31 
Separable function, 31 
Separable programming, 3, 31 
Sequential decision problem, 544 
Sequential linear discrete programming, 
614 

Sequential linear integer programming, 
614 

Sequential linear programming, 387 
geometric interpretation, 388 
Sequential quadratic programming, 422 
derivation, 422 
solution procedure, 425 
Sequential unconstrained minimization, 
431 

Serial multistage decision process, 545 
Shadow prices, 92 
Shell and tube heat exchanger, 51 
Side constraints, 7 
Sigmoid function, 728 
Signum function, 509 
Simplex, 328, 384 
Simplex algorithm, 139 
flow chart, 143 

Simplex method, 138, 304, 328 
flow chart, 153 
two phases, 150 
Simplex multipliers, 180 
Simply supported beam, 58 
Simulated annealing, 3, 693, 702 
Simulation methods, 3 
Simultaneous equations, 133 
Simultaneous search method, 257 
Single stage decision problem, 546 
Single variable optimization, 63 
Slack variable, 123 
Slider crank mechanism, 49 
Solid body of revolution, 672 
Solution by direct substitution, 76 
Solution of linear equations, 133 
Spring-cart system, 71 
Stamping of circular discs, 49 
Standard deviation, 635, 636
Standard form of LP problem, 122 
Standard normal distribution, 643 
Standard normal tables, 643, 645 
Starting feasible point, 434 
State inversion, 566 

Statement of an optimization problem, 6 
State transformation, 546 
State variables, 16 
State vector, 679 
Statically determinate truss, 542 
Static optimization problem, 15 
Stationary point, 65 
Stationary values of functionals, 669 
Statistical decision theory, 3 
Statistically independent events, 633 
Statistical methods, 3 
Steepest ascent direction, 335 
Steepest descent method, 304, 339 
convergence criteria, 341 
Step-cone pulley, 19 
Step length determination, 398, 407 
Stochastic process techniques, 3 
Stochastic programming, 3, 29, 632 
geometric programming, 659 
linear programming, 647 
nonlinear programming, 652 
Structural error, 535 
Structural optimization packages, 787

Suboptimization, 551 
SUMT, 431 

Superlinear convergence, 305, 361 
Surplus variable, 124 
Survival of the fittest, 696 
Symmetric primal-dual relations, 192 
System reliability, 583 

T 

Taylor’s series expansion, 68, 642 
Tentative solution, 670 
Termination criteria, 401 
Test functions (unconstrained nonlinear 
programming), 363 
Beale’s function, 365 


Brown’s badly scaled function, 365 
Fletcher and Powell’s helical 
valley, 364 

Freudenstein and Roth function, 364 
Powell’s badly scaled function, 365 
Powell’s quartic function, 364 
Rosenbrock’s parabolic valley, 363 
Wood’s function, 365 
Testing for concavity, 779 
Testing for convexity, 779 
Testing Kuhn-Tucker conditions, 465 
Test problems (constrained nonlinear 
programming), 467 
heat exchanger, 473 
speed reducer (gear train), 472 
three-bar truss, 467
welded beam, 470 
25-bar space truss, 468 
Trajectory optimization problem, 16 
Transformation techniques, 428 
Transformation of variables, 381, 428 
Transportation array, 221 
Transportation problem, 177, 220 
Transportation technique, 221 
Transversality conditions, 682 
Trapezoidal rule, 458 
Travelling salesperson, 52 
Trial, 254 

Truss, 47, 55, 60, 248, 481, 487, 578, 623, 
686, 691, 692, 741, 745, 758, 772 
Tubular column design, 10, 654 
Two-bar truss, 47, 55, 487, 533, 758 
Two degree of difficulty problem, 526 
Two phases of simplex method, 150 
Two stage compressor, 67 
Types of multistage decision 
problems, 548 

U 

Unbounded solution, 127, 142, 146 
Unconstrained minimization (GMP), 493 
Unconstrained optimization problem, 6, 
301 

Unconstrained optimization techniques, 
Underachievement, 766
Uniform distribution, 643 
Unimodal function, 253 
Union of fuzzy sets, 725 
Univariate distribution, 639 
Univariate method, 304, 315 
Unrestricted search, 253, 254 
Usable direction, 97 
Usable feasible direction, 393 
Utility function method, 763 

V 

Valuation set, 723 

Variable metric method, 354, 361 

Variance, 636 

Variation, 669 

Variational operator, 669 

Vector minimization problem, 762 

Vector of simplex multipliers, 180 

Venn diagram, 723 

Vertex, 130 


W 

Water resource system, 241 
Water tank design, 550 
Weighting function method, 764 
Weights, 501 

Well conditioned matrix, 306 
Wood’s function, 365 

Z 

Zero degree of difficulty problem, 511, 
525, 531 

Zero-one LP problem, 608 
Zero-one polynomial programming, 608 
Zero-one problem, 588 
Zero-one programming problems, 588, 
604 

Zeroth order methods, 304 
Zoutendijk’s method, 381, 394 
determination of step length, 398 
direction finding problem, 395 
termination criteria, 401 


